GENERATION AND SELECTION OF RECOMBINANT
VARIEGATED BINDING PROTEINS



This invention relates to development of novel binding
proteins [...] amplification.

Information Disclosure Statement

The amino acid sequence of a protein determines its
three-dimensional (3D) structure, which in turn [...].






The 3D structure of a protein is essentially [...] variety is
allowed have the amino acid [...] p169-171 and [...],
p239-245, 314-315).

The secondary structure (helices, sheets, turns, loops) of a
protein is determined mostly by local sequence. Certain amino
acids tend to be correlated with certain secondary structures,
and the commonly used [...] these correlations. However, every
amino acid type has [...] the conformations of the
pentapeptides [...] different (KABS84, ARGO87).

Turns and loops tolerate insertions and deletions more readily
than do other secondary structures [...] in loops and turns.

Changing three residues in subtilisin from [...] organisms,
82 differences remained in the sequences. The three residues
changed were chosen [because they] were the only differences
within 7 Angstroms (A) of the active site ([...]).

Schulz and Schirmer summarize many observations on binding of
proteins to other molecules ([...]). For example, haemoglobin
alpha chains bind very tightly to haemoglobin beta chains
(delta G more negative than -11.0 Kcal/mole), antibodies bind
tightly to antigens (Kd's range from [...] to [...] M, Kd =
dissociation constant equal to [A][B]/[A:B]), basic bovine
pancreatic trypsin inhibitor (BPTI) binds tightly to trypsin
(Kd = 6.0 x 10^-14 M, delta G = -18.0 Kcal/mole), and avidin
binds to biotin (Ka = [...]
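The binding constants and free energies quoted above are related by the standard thermodynamic identity delta G = RT ln(Kd). The following is a minimal sketch of that conversion, assuming a 1 M standard state and a temperature of 25 C; neither assumption is stated in the text, and the function names are illustrative only.

```python
import math

# Gas constant in kcal/(mol*K); the temperature default is an assumption (25 C).
R_KCAL = 1.987204e-3


def delta_g_from_kd(kd_molar, temp_kelvin=298.15):
    """Standard free energy of binding (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_kelvin * math.log(kd_molar)


def kd_from_delta_g(delta_g_kcal, temp_kelvin=298.15):
    """Dissociation constant (mol/L) from a standard free energy of binding (kcal/mol)."""
    return math.exp(delta_g_kcal / (R_KCAL * temp_kelvin))


if __name__ == "__main__":
    # BPTI:trypsin figure quoted in the text
    print(round(delta_g_from_kd(6.0e-14), 1))
    # Kd implied by the -11.0 Kcal/mole bound quoted for haemoglobin chains
    print(kd_from_delta_g(-11.0))
```

Under these assumptions, a Kd of 6.0 x 10^-14 M corresponds to about -18 Kcal/mole, which agrees with the BPTI:trypsin figures, and -11.0 Kcal/mole corresponds to a Kd on the order of 10^-8 M.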

Binding results from complementarity of surfaces that come
into contact: bumps fit into holes, unlike charges come
together, dipoles align, and hydrophobic atoms contact other
hydrophobic atoms. Although bulk water is excluded, individual
water molecules are frequently found [...] interfaces; these
[...] hydrogen bonds to one or more atoms of the protein or to
other bound water.

[...] designing new [...] surfaces proved difficult. Although
some [...] developed for substituting side groups [...],
proteins are floppy and it [is difficult] to predict what
conformation a new [...]. [The] forces that bind proteins [to
other] molecules are all relatively weak and it [is] difficult
to [...] design superior binding proteins based on theory
alone (QUIO87).

Enzyme-substrate affinity, however, has fortuitously been
[...] engineering [...] increase in affinity for [...].
Substitution of one amino acid for another [can] profoundly
alter [...] protein other than substrate binding, [without]
affecting the tertiary structure of [...]. [...] of the surface
residue [...] sickle-cell beta chains causes deoxyhaemoglobin-S
to form fibers through self binding [...] tertiary and
quaternary structure of the haemoglobin are [...] (WISH75,
[...]).

Changing a single amino acid in BPTI greatly reduces its
binding to trypsin, but some of the new [...] elastase [...].
Changes of single amino acids [...] affinity for [...] binding
activity.






The recently developed techniques of "reverse genetics" have
been used to produce single specific mutations at precise base
pair loci ([...]) [...] sequencing and in some cases [...]
function. These procedures allow [researchers] to analyse the
function of [each residue in a protein] ([...]) or of each base
pair in a regulatory [region] ([...]). In these analyses, the
norm has been to strive for the classical goal of obtaining
mutants carrying a single alteration ([...]).

Reverse genetics is often applied to coding regions to
determine which residues are most important to protein
structure and function. Isolation of a single mutant at each
residue of the protein gives an initial estimate of which
residues play crucial roles.

Prior to the method of the present invention, two general
approaches have been developed to create mutant proteins
through reverse genetics. In one approach, dubbed "protein
surgery" (DILL87), a specific substitution [...] and function
of [...] substitutions ([...]). However, many desirable protein
alterations require [...] thus are not [...] all possible amino
acid substitutions [...] residue.

The other approach has been randomly to generate a [...]
sequencing (PAKU86). This approach is limited [by the] number
of colonies that [can be] examined. Also, it does not take
advantage of any knowledge [of] protein structure and its
relationship to binding activity.

Progress toward rules governing [...] ([...]83) has been
greatly hampered by the extensive efforts involved [in] using
either method and the practical limitations on the number of
colonies that can be inspected (ROBE86).

The term "saturation mutagenesis" with reference to [...] is
[generally] taken to mean generation of a population in which
a) every possible single-base [...] and cloning of highly
degenerate oligonucleotides and [...] promoter sequence [...]
similar methods [...] to study genetic expression of proteins,
but they do not say how to: a) choose protein residues to vary,
[...] mutants with desirable properties.

Reidhaar-Olson and Sauer have used synthetic degenerate
oligonucleotides [to vary] simultaneously two or three residues
[...] the dimer interface of [...] from bacteriophage lambda.
They give [...] limits on how many residues could [be varied],
nor do they mention the problem of [...] encoding different
[...] for proteins that either [had] wild-type dimerization or
that did not dimerize. They [...] novel binding properties and
[...]






[...] rather than [...] made a polypeptide that [binds] DDT in
55% ethanol ([...]83). [...] recently [...] reported genetic
expression in [...] of a designed 24-residue DDT-binding
protein and [...] the DDT-binding sequence to LacZ. [...]
impossible, biologically active proteins [...]

Erickson et al. (ERIC86) have designed and synthesized [...]
proteins that they have named betabellins, that [are meant to]
have beta sheets. They [...] synthesis with mixed [...] several
hundred analogous betabellins, [and use] of a column to recover
analogues with high [affinity for the] target compound bound to
the column. [...] successive rounds of mixed synthesis of
variant proteins and purification by specific binding [...]
should be [...]. Because proteins cannot be amplified, the
researchers must sequence the recovered protein to learn which
substitutions improve binding. The researchers must limit the
diversity so that each variety [is] present in sufficient
quantity [for the isolated] fraction to be sequenced.

[...] through [...] applied to animal [...] common problems:
a) non-[...] affinity supports, and b) [...] affinity matrices
([...]85).

Ferenci and collaborators [have published a series] of papers
on the chromatographic [...] of the [...] ([...]). The papers
report that spontaneous [mutants at a] genetic locus can be
isolated by chromatography over a column supporting [...],
maltodextrins, or starch. The reports [suggest] that other
applications [are] possible but [...] only the [...]
responsible for the [...] binding to any target. [...]
metabolize chemicals and [alter] their physiological behavior
during the chromatographic experiment.






can be 
If 
[A] fragment of a heterologous gene can be [introduced] into
[...] gene III [...]. [The] inserted gene [...] expression of
the [...] domain to appear [in the] gene III protein. The
resulting strain of [phage was bound] by the antibody [to the]
heterologous [protein]. The phage [were] eluted at pH 2.2 and
retained some [infectivity]. [...] gene III was used for
insertion of the heterologous gene so that all copies of gene
III protein were affected; infectivity of the resultant phage
was reduced 25-fold.

[...] presented [...] cloned genes using antibodies to [...]
made no mention of [mutagenesis of the inserted] material or of
inducing novel binding properties in the inserted protein
domain.

[...] of the repeat region of the [...] protein from [...] of
surface [...] has been [expressed] as an insert in the gene III
protein ([...]). The recombinant phage were both antigenic and
[immunogenic] in rabbits. The authors do not suggest
mutagenesis of the inserted material.

[...] antigens have been fused [...]; the fusion is in a region
coding [...] the HBV antigen. [...] engineered [...]

Ladner, US Patent No. 4,704,692, "Computer Based System and
Method for Determining and Displaying Possible Chemical
Structures for Converting Double- or Multiple-Chain
Polypeptides to Single-Chain Polypeptides" [describes a] method
for converting proteins composed of two [or more chains] into
proteins of fewer polypeptide chains [with] essentially the
same 3D structure. There [is no mention of] variegated DNA and
no genetic [selection]. [...] WO88/01649 (Publ. March [...])
[discloses] the specific application [of computerized design]
of linker peptides to the preparation of single [chain
antibodies].

Ladner, Glick and Bird, WO88/06630 (publ. [...] Sept. [...]),
[speculate that] single chain antibody domains may be screened
for binding to a particular antigen by varying the DNA encoding
the combining determining regions of [a single chain] antibody,
subcloning the [...] gene of phage lambda so that a [...] the
outer surface [...], and [selecting] phage [by] affinity
chromatography. The only antigen mentioned is bovine growth
hormone. No other [...] molecules, targets, [...] proteins are
[discussed]. Nor is there any mention of the method or degree
of mutagenesis.

[...] are not displayed on the outer surface [...].

No admission is made that any cited reference is prior art or
pertinent prior art, and the [dates given] are those appearing
on the reference and may not be identical to the actual
publication date.

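SUMMARY OF THE INVENTION

This invention relates to the construction, [expression,] and
selection of mutated genes that specify [novel proteins with
desirable binding properties]. [The substances bound by these
proteins, hereinafter] referred to as "targets", may be, but
[need not be, proteins. Targets may include other biological
or] synthetic macromolecules as well [as other] molecules.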
The novel binding protein may be obtained: 1) by mutating a
gene encoding a known binding protein within the subsequence
encoding a known [binding domain], 2) by taking such a
subsequence of the gene for a first protein and combining it
with all or part of a gene for a second protein (which may or
may not be itself [a known] binding protein), [or 3) by
mutating a gene encoding] a protein which, while not possessing
[a known binding] activity, possesses a secondary or higher
[structure] that lends itself to binding activity (clefts,
grooves, [...]) [...] binding protein but not in the
subsequence known to cause the binding. [The protein from which
the novel] binding protein is derived [need] not have any
specific affinity for the target material.

In one embodiment, the invention relates to:

a) preparing a variegated population of replicable genetic
packages, each package including a nucleic acid construct
coding on expression for an outer surface-displayed potential
binding protein comprising (i) a structural signal directing
the display of the protein on the outer surface of the package
and (ii) a potential binding domain for [binding said target,
where the population collectively displays] a plurality of
different potential binding domains [...] by the individual
packages,

b) causing the expression of said protein and the display of
said protein on the outer surface of such packages,

c) contacting the packages with target material so that the
potential binding domains of the proteins and the target
material may interact, and separating packages bearing a
potential binding domain that succeeds in binding the target
material from packages that do not so bind,

d) recovering and replicating at least one package bearing a
successful binding domain,

e) determining the amino acid sequence of the successful
binding domain of a genetic package which bound to the target
material,

f) preparing a new variegated population of replicable genetic
packages [according to step (a)], the parental potential
binding domain [for the] potential binding domains [of said new
packages] being a [successful] binding domain whose sequence
was determined in step (e), and repeating steps [(b)]-(e) with
said new [population until a] package bearing a binding domain
[with] desired binding characteristics is obtained, and

g) [isolating the] gene encoding the desired binding domain
from the genetic package and placing it into a [suitable
expression system]. (The binding domain may then be expressed
as a unitary protein, or as [a domain of a larger] protein).

The invention further relates to a method of preparing a mixed
population of replicable [genetic] packages in which each
package includes a gene expressing a potential binding protein
in such a manner that the protein is presented on the [outer
surface of] the package. This method comprises:

i) preparing a variegated population of [DNA] inserts, each of
which comprises a sequence which codes on [expression for a]
potential binding domain and a signal [requiring that] the
encoded protein be displayed on the outer surface of a chosen
replicable genetic package, and

ii) [incorporating] the resulting population of DNA constructs
into the replicable genetic packages to produce a population
[of] genetic packages.

In a preferred embodiment, the potential-binding-protein-
encoding [sequence is] incorporated into a gene encoding an
outer-surface protein [of] the replicable genetic package.

The invention encompasses [...] synthesis [...] regions, said
[...] predetermined [...] toward obtaining a protein that binds
[...] target.

For the purposes of [...] by one species of DNA in a population
of [...] variation appears in one or more [...] encoding one or
more segments of the polypeptide having the potential of
serving as a binding domain for the target substance.

[From time to] time it may be helpful to speak of the "parent
sequence" of the variegated DNA. When the binding domain sought
is an analogue of a known binding domain, the parent [sequence
...] variegated DNA will be identical [with the parent]
sequence at most [...] potential binding domain [...] sequence
which [...] principles, the parent sequence [...] codes the
amino acid [sequence] predicted to form the desired [...]
variegated [...] are related [...] level of sequence
similarity.

The fundamental principle of the invention is one of forced
evolution. [...] the forced evolution is greatly [enhanced] by
careful choice of which residues are to be varied. [The 3D]
structure of the potential binding domain is a key determinant
in this choice. First, a set of [residues that can]
simultaneously contact one molecule of the target is
identified. [...] these residues are varied simultaneously to
produce a variegated population of DNA. [The] variegated
population of DNA is [expressed so] that a variegated
population of genetic packages [is] produced.

The population of genetic packages containing genes
[expressing] possible binding proteins is enriched for packages
[containing genes expressing] proteins that [bind the target]
("successful binding domains"). [After one or] more rounds of
such enrichment, one or more [of the selected genes are]
examined and [new loci of] variation are chosen. The selected
daughter genes of one generation then become the parental
sequences for the next generation [of variegated DNA, beginning
the next] "variegation cycle". Such cycles are continued until
a protein with the desired target affinity is obtained.
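The cycle of variegation, selection, and re-parenting described here can be illustrated with a toy sketch. Everything in it is a stand-in rather than the method of the specification: `toy_score` is a hypothetical scoring function that replaces real affinity separation, the sequences are arbitrary, and variants are enumerated exhaustively instead of being encoded in DNA and displayed on genetic packages.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def variegate(parent, positions):
    """All sequences identical to the parent except at the chosen positions."""
    variants = []
    for combo in product(AMINO_ACIDS, repeat=len(positions)):
        seq = list(parent)
        for pos, aa in zip(positions, combo):
            seq[pos] = aa
        variants.append("".join(seq))
    return variants


def select_best(population, score):
    """Stand-in for selection-through-binding: keep the top-scoring variant."""
    return max(population, key=score)


def forced_evolution(parent, score, position_sets, goal):
    """Run one variegation cycle per position set until the goal score is reached."""
    for positions in position_sets:
        # The selected domain of one cycle becomes the parent of the next.
        parent = select_best(variegate(parent, positions), score)
        if score(parent) >= goal:
            break
    return parent


def toy_score(seq, ideal="ACDWKY"):
    """Hypothetical 'affinity': number of positions matching an arbitrary ideal."""
    return sum(a == b for a, b in zip(seq, ideal))


if __name__ == "__main__":
    result = forced_evolution("AAAAAA", toy_score, [(0, 1), (2, 3), (4, 5)], goal=6)
    print(result)
```

Varying two residues at a time keeps each population at 20^2 = 400 variants, which loosely mirrors the point that only a few residues are varied simultaneously in each cycle.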
The appended claims are hereby incorporated by reference into
this specification as an enumeration of the preferred
embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic showing the relationships between
various types of Binding Domains (BD).

Figure 2 is a flow chart showing the major steps used [to
obtain a] protein with affinity for a pre-determined target.

Figure 3 is a schematic of a PBD contacting a molecule of
target material.

Figure 4 is a schematic of the construction of pLG3 from
M13mp18 and pBR322.

Figure 5 is a schematic of the construction of pLG7 from pLG3
and synthetic DNA.

DETAILED DESCRIPTION OF THE INVENTION

[...] The present [invention ...] properties from closely
related genes that specify proteins with [...] properties, by
[arranging that the product] of each mutated gene be
[displayed] on [the] outer surface of a replicable [genetic]
package that contains the gene, and [...] binding to that
target material.

Let KD(x,y) be a dissociation constant,

    KD(x,y) = [x][y] / [x:y]
For the purposes of the appended claims, a protein P is a
binding protein if

(1) for one molecular, ionic or atomic species A, the
dissociation constant KD(P,A) < 10^-6 moles/liter, and

(2) for a different molecular, [ionic] or atomic species B,
KD(P,B) > 10^[...] moles/liter.

As a result of these two conditions, the protein P exhibits
specificity for A over B, [and a minimum degree] of affinity
(or avidity) for A.
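The two-part definition above can be expressed as a simple check. This is only a sketch: the upper threshold of 10^-6 moles/liter for species A comes from the text, but the lower threshold for species B is not legible in this copy, so the default `weak_threshold` value below is a placeholder assumption, and the function names are illustrative.

```python
def dissociation_constant(free_x, free_y, complex_xy):
    """KD(x,y) = [x][y] / [x:y], with equilibrium concentrations in mol/L."""
    if complex_xy <= 0:
        raise ValueError("complex concentration must be positive")
    return free_x * free_y / complex_xy


def is_binding_protein(kd_a, kd_b, tight_threshold=1e-6, weak_threshold=1e-4):
    """True if P binds A tightly and B only weakly.

    tight_threshold follows the text (KD(P,A) < 10^-6 mol/L).
    weak_threshold is an assumed placeholder for the illegible KD(P,B) bound.
    """
    return kd_a < tight_threshold and kd_b > weak_threshold


if __name__ == "__main__":
    kd = dissociation_constant(1e-6, 1e-6, 1e-6)
    print(kd)
    print(is_binding_protein(kd_a=6.0e-14, kd_b=1e-3))
```

Requiring both conditions at once is what makes the definition about specificity rather than stickiness: a protein that binds everything tightly fails condition (2).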

When a domain of a protein is primarily responsible for the
protein's ability to specifically bind a chosen target, it is
referred to herein as a "binding domain" (BD). We engineer the
appearance of a [...] "initial potential binding domain" (IPBD)
[on the surface of a] genetic package. The present invention
[is] concerned with the expression of numerous, diverse
[variant] "potential binding domains" (PBD), [all related to a
"parental] potential binding domain" (PPBD) [such as a binding]
domain of a known binding protein, and with selection [and]
amplification of the genes encoding the most successful mutant
PBDs. [The] first round of [variegation and]
selection-through-binding isolates one or more "successful
binding domains" (SBD). An SBD from one round of variegation
and selection-through-binding [is chosen to] be the PPBD for
the next round. The [invention is not] limited to proteins with
a [single BD ...] sequentially or simultaneously. The
relationships of the various BDs are illustrated in Figure 1.

The term "variegated DNA" (vgDNA) refers to a population of
molecules that have the same base sequence through most of
their length, but that vary at a limited number of defined
loci. A molecule of variegated DNA can be introduced into a
plasmid so that it constitutes part of a gene ([...]). When
plasmids containing variegated DNA are used to transform
bacteria, each [...] colony of bacteria may produce a different
[variant] from the original protein [and from that of any]
other colony. If the variegations of the DNA are concentrated
at loci known to be on the surface of the protein [...], a
population of [proteins will be generated, many] members of
which [will have] roughly the same 3D structure as the [parent
protein]. The specific binding properties of each member may be
different from each other member. [It remains] to sort out the
colonies [that exhibit] desirable binding properties from those
that do not exhibit the desired affinities.
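As a concrete picture of a population that is constant through most of its length and varied at a few defined loci, the sketch below expands a DNA template written with standard IUPAC ambiguity codes into every member sequence. The use of IUPAC codes and the example template are assumptions made for illustration; the text does not say how the variegation is specified.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def diversity(template):
    """Number of distinct DNA sequences encoded by a degenerate template."""
    count = 1
    for code in template:
        count *= len(IUPAC[code])
    return count


def expand(template):
    """Enumerate every DNA sequence in the variegated population."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in template))]


if __name__ == "__main__":
    template = "ATGNNKGCT"  # hypothetical: one varied codon between two fixed codons
    print(diversity(template))
    print(expand(template)[:3])
```

Because the diversity is the product of the choices at each varied base, it grows quickly as more loci are varied, which is why the number of residues varied at once has to be matched to the number of clones that can actually be screened.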

A "single-chain antibody" is a single chain polypeptide
comprising at least [...] amino acids forming [two]
antigen-binding regions connected by a peptide [linker that]
allows the two regions to fold together to bind [the antigen].
Either [of the] two antigen-binding regions [...] domains of
antibodies, [...] into a beta barrel of [...] strands of [...]
spatially related in the domains, and (2) [...] known antibody
[...] variable domains fit together in the same [way as the
variable domains] of said known antibody. Generally this will
require that, [except for the amino acids] corresponding to the
hypervariable region, there is at least 88% homology with the
amino [acid sequence of the] variable domain of a known
antibody.

[Affinity separation includes, but] is not limited to: a)
[affinity column chromatography], b) batch elution from an
affinity [matrix] material, c) batch elution from an [affinity
material attached to a] plate, d) [fluorescence activated] cell
sorting, and e) electrophoresis in the presence of target
material. "Affinity material" is used [to mean a material with]
affinity for the material to [be purified], called the
"analyte". [In most cases, the association of the] affinity
material [and the analyte] is reversible so that [the analyte
can be freed] from [the] affinity material. [Affinity] column
[chromatography, batch elution from] an affinity matrix [...]
batch elution from [a plate are very] similar and hereinafter
will be [treated] under "affinity chromatography."



[Fluorescence-activated cell sorting involves use of] an
affinity material that is [fluorescently] labeled [...].
Current commercially available cell [sorters] require [...] to
1000 molecules of fluorescent dye, [...] to each cell. FACS can
sort 10^[...] [cells]/sec.

Electrophoretic affinity separation involves electrophoresis of
viruses [or cells in the] presence of target material, [wherein
the binding of said] target material changes the net charge [of
the] virus particles or cells. It has been used to [separate]
bacteriophages on the basis of charge ([...]87).

The present invention makes use of affinity separation of
bacterial cells, bacterial viruses (or [...]) [to enrich] a
population for [those cells or viruses] carrying genes that
code for proteins [with desirable binding properties].

In the present invention, the words "select" and "selection"
are used [in the genetic sense, i.e. a process whereby a]
phenotypic [characteristic is used] to enrich a population for
those [...].

The process of the present invention comprises three major
parts:






I. design and production of a replicable [genetic package (GP)
that displays an IPBD on its surface], denoted GP(IPBD),

II. [design and implementation of an affinity] separation
process that [is capable of separating GP(IPBD)s] that bind to
a known affinity molecule from wild-type GPs or GP(IPBD-)s,
neither of which binds the known affinity molecule, and

III. design and implementation of a genetic variegation
method, denoted structure-directed mutagenesis, wherein a
population of 10^6 or more different GP(PBD)s, denoted
GP(vgPBD), is produced.

[...] is called a "separation cycle"; [...] "variegation cycle"
[...] and chosen target [...] of binding between an SBD and
[the target is] achieved.

Part I is a strain construction in which we deal with a single
IPBD sequence. Variability may be [introduced] into sequences
adjacent to the [IPBD] and within the [...] so that the IPBD
will appear on the GP surface. [...] an antibody, having [high
affinity for the IPBD, ...] display of [the IPBD] on the [GP]
surface, b) [...] the GP surface. In one preferred embodiment,
the process involves:

1) choosing a GP such as a bacterial cell (Sec. 1.1.1),
bacterial spore (1.2.1), or phage (1.3.1), having a suitable
outer surface protein (Secs. 1.1.3, 1.2.3, and 1.3.3),

2) choosing a stable IPBD (Sec. 2),

3) designing an amino acid sequence that: a) includes the IPBD
as a subsequence and b) will cause the IPBD to appear on the GP
surface (Secs. 1.1.2, 1.2.2, 1.3.2, and 4),

4) engineering a gene, denoted osp-ipbd, that: a) codes for the
designed amino acid sequence, b) provides the necessary
[genetic regulation], and c) introduces convenient sites for
[genetic] manipulation (Secs. 4.1, 4.2, 4.3, [...]),

5) cloning the osp-ipbd gene into the GP (Sec. 6.1), and

6) harvesting the transformed GPs (Sec. 7) and testing them
[for presence] of [IPBD] on [the GP] surface (Sec. 8); this
test is performed with an affinity molecule having high
affinity for IPBD, denoted AfM(IPBD).
In another preferred embodiment, Part I of the process
involves:

1) and 2) as above,

3) designing a DNA sequence that: a) encodes the IPBD as a
subsequence, b) contains suitable restriction sites so that
random DNA [can be] operably linked to the ipbd gene fragment,
and c) provides the necessary genetic regulation; this DNA
sequence is called a "display probe" (Secs. 1.1.4, 1.2.4,
1.3.4 and 4),

4) constructing that display probe,

5) cloning the display probe into the OCV and amplifying it in
a suitable host,

6) cloning random or pseudorandom DNA [into one] of the
restriction sites provided in the display probe (Sec. 6.2),
whereby the random or pseudorandom DNA functions as a potential
osp, and

7) harvesting GPs (Sec. 7) and screening colonies of the
transformed GPs for presence of IPBD on the GP surface; this
screening is performed with an affinity molecule having high
affinity for IPBD, denoted AfM(IPBD) (Sec. 8); or,
alternatively,

8) selecting GPs that display IPBD by use of an affinity
separation using AfM(IPBD) (Sec. 8).

Once a GP(IPBD) is produced, it can be used many times as the
starting point [for developing different novel proteins that
bind to a variety of different] targets. [The knowledge of how
we] engineer the appearance of one IPBD on the surface of a GP
can be used to design and produce other GP(IPBD)s that display
different IPBDs.

Although Part I deals with only a single IPBD, many
preparations are made for Part III where we introduce numerous
mutations into the potential binding domain. References to PBD
or [...] in Part I are to indicate a preparatory intent.
In Part II we optimize separation of GP(IPBD) [from wtGP based
on the affinity of] IPBD for AfM(IPBD) and establish [the
sensitivity of the] affinity separation process. [In one]
preferred embodiment, Part II of the process [of the present]
invention involves:



1) preparing affinity columns bearing AfM(IPBD) at various
densities of AfM(IPBD)/(volume of matrix) (Sec. 10.1),

2) preparing GP(IPBD)s with various amounts of IPBD per GP,

3) picking a gradient regime for eluting the columns
(Sec. 10.1),

4) determining which combination of: a) IPBD/GP, b) density of
AfM(IPBD)/(volume of support), c) initial ionic strength, d)
elution [...], and e) (amount of GP)/(volume of support)
loaded, gives the best separation of GP(IPBD) [from] wtGP
(Sec. 10.1),

5) determining the smallest amount of GP(IPBD) that can be
isolated from a much larger [amount of] wtGP using the optimal
condition (Sec. 10.2), and

6) determining the efficiency of the affinity separation
procedure (Sec. 10.3).
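One way to make the last two steps quantitative is to track the ratio of displaying packages to wild-type packages before and after a separation. The sketch below assumes a simple definition of enrichment (output ratio divided by input ratio) and assumes that each round enriches by the same factor; neither the definition nor the example numbers come from the specification.

```python
import math


def enrichment_factor(binders_in, others_in, binders_out, others_out):
    """Ratio of (binders : non-binders) after separation to the same ratio before it."""
    return (binders_out / others_out) / (binders_in / others_in)


def rounds_needed(initial_ratio, desired_ratio, enrichment):
    """Separation rounds needed to raise the binder ratio, at constant enrichment per round."""
    if enrichment <= 1.0:
        raise ValueError("enrichment per round must exceed 1")
    if initial_ratio >= desired_ratio:
        return 0
    exact = math.log(desired_ratio / initial_ratio) / math.log(enrichment)
    # Small tolerance so that exact powers are not pushed up by rounding error.
    return math.ceil(exact - 1e-9)


if __name__ == "__main__":
    # Hypothetical: one binder per 10^6 packages loaded, one per 10^3 recovered.
    e = enrichment_factor(1, 1_000_000, 1, 1_000)
    print(e)
    print(rounds_needed(1e-6, 1.0, 500.0))
```

Read this way, the smallest recoverable amount in step 5 bounds how rare a displaying package can be in the starting population, and the per-round efficiency in step 6 sets how many separation rounds are needed before it dominates.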

Part II optimizes separation of a single type of [GP(IPBD)
from wtGP]. The optimum conditions will [...] specific pHs. In
Part III, the user must specify [the] conditions under which
the [selected SBD] should bind the target. If the conditions of
intended [use differ] markedly from [the conditions for] which
affinity separation was [optimized, the user must return to
Part] II and optimize the affinity [separation] for conditions
similar to the conditions of intended [use of the] selected
SBD.

In Part III, we choose a target material and a [GP(IPBD), and
generate a] variety of PBDs. We use [an affinity separation
process] developed by the method of [Part II to] enrich the
population of GP(vgPBD)s for GPs that display PBDs with binding
properties [with respect to the target that are] superior to
the binding properties of [the] PPBD. [An] SBD selected from
one variegation cycle becomes [the PPBD to] the next
variegation cycle. [...] Part III of the process of the present
[invention] involves:

1) picking a target molecule (Sec. 11),

2) picking a GP(IPBD) (Sec. 12),

3) picking a set of several residues in the PPBD to vary based
on a) the 3D structure of the IPBD, b) sequences of homologous
proteins, and c) computer or theoretical modeling that
indicates [...],

4) [picking a subset of residues to vary] simultaneously based
on the [number of] variants and which [...] detection
capabilities of the affinity [separation] (Sec. 13.2);

5) implementing the variegation by:

a) [synthesizing DNA in which some] or all of the bases
encoding residues [selected] for variation [are varied],
thereby creating a population of DNA molecules, denoted vgDNA
(Sec. 13.3),

b) ligating this vgDNA, by [standard methods, into] the
operative cloning vector (OCV) ([e.g.] a plasmid or
bacteriophage) (Sec. 14.1),

c) using the ligated DNA to transform cells, thereby producing
a population of transformed cells (Sec. 14.2),

d) culturing [the transformed cells ...] population of [GPs
...], the population of GP(PBD)s being denoted as GP(vgPBD)
(Sec. 14.3),

e) enriching the [population for GPs that bind] the target by
using the affinity [separation] process developed in Part II,
with the chosen target molecule as affinity molecule (Sec. 15),

f) repeating steps III.5.d and III.5.e until a GP(SBD) having
improved binding to the target is isolated (Sec. 15), and

g) testing the isolated SBD or SBDs for affinity and
specificity for the chosen target (Sec. 15.8),

6) repeating steps III-3, III-4, and III-5 until the desired
degree of binding is obtained.

[...] chosen target [...] previously-[...] conditions of use
[...] the conditions of previous optimizations.
Sec. 0.2: Abbreviations

The following abbreviations will be used throughout the
present invention:

Abbreviation   Meaning

GP             Genetic Package, e.g. a bacteriophage

X              Any protein

x              The gene for protein X

IPBD           Initial Potential Binding Domain, e.g. BPTI

PBD            Potential Binding Domain, e.g. a derivative of
               BPTI

SBD            Successful Binding Domain, e.g. a derivative of
               BPTI selected for binding to a target

PPBD           Parental Potential Binding Domain, i.e. an IPBD
               or an SBD from a previous selection

OSP            Outer Surface Protein, e.g. coat protein of a
               phage or LamB from E. coli

OSP-PBD        Fusion of an OSP and a PBD, order of fusion not
               specified

OSTS           Outer Surface Transport Signal

GP(x)          A genetic package containing the x gene

GP(X)          A genetic package that displays X on its outer
               surface

[...]          An affinity matrix supporting "Q", e.g.
               (T4 lysozyme) is T4 lysozyme attached to an
               affinity matrix

AfM(W)         A molecule having affinity for "W", e.g. trypsin
               is an AfM(BPTI)

XINDUCE        A chemical that can induce expression of a gene,
               e.g. IPTG for the lacUV5 promoter

OCV            Operative Cloning Vector

KT             KT = [T][SBD]/[T:SBD] (T is a target)

KN             KN = [N][SBD]/[N:SBD] (N is a non-target)

DoAMoM         Density of AfM(W) on affinity matrix

[...]          Abundance of DNA molecules encoding amino acid x

OMP            Outer membrane protein

nt             Nucleotide

Kd             A bimolecular dissociation constant,
               Kd = [A][B]/[A:B]

Serr           Error level in synthesizing vgDNA
Sec. 0.3

The present invention is not limited to a single method of
determining [the sequence of nucleotides] (nts) in DNA
subsequences [...] agarose gel [...] gel electrophoresis [...].

The present invention is not limited to a single method of
[determining protein] sequences, and reference in the appended
claims to determining the amino acid sequence of a [domain is
intended to include any] practical method or [combination] of
methods, whether direct or indirect. [The preferred] method, in
most cases, is to determine the [sequence of] the DNA that
encodes the [...] sequence. In some cases, standard methods of
protein-sequence determination may [be needed to detect
post-]translational processing.

The major steps in the process of making and isolating a novel
[binding protein with] affinity for a chosen target material
are [shown in] Figure 2.






■ A that the aP on which selection- 
» is emphasized „„st he capahle. 

through-binding wiH 
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part of the ^o«th, the x 

approxiBately ' that e=*ibits the desired 

component of a population example, one 

hindin, properties may -J--;^tnt'of the population 
10« or less, once components, it must 

U separated from ^f-^^^tl^, J.^^hle cells is 

the most J_ ,,„,tlo messages can also he 

The GP may be a vegetative bacterial cell,
a bacterial spore, or a bacterial virus. A strain
of any living cell or virus is potentially useful if
the strain can be:






1) maintained in culture,

2) affinity separated and retain its viability,

3) genetically altered with reasonable facility,
and

4) manipulated to display the potential binding
protein domain where it can interact with the
target material during affinity separation.

DNA encoding the IPBD sequence may be operably
linked to DNA encoding the outer surface
transport signal of an outer surface protein (OSP)
native to the GP so that the IPBD is displayed on the
outer surface of the GP.



PCT/US89/03731

WO 90/02809

. genetic p.c,«,e «^Xa. - ---^O^n 

,,3 ou^. s^^ace^ ^ ^^^^^^^^^ - 
viability of the GP or ^^^^^ ^^^^ 

™ -OTH86, SMITB5, and 

boundaries (BECKbj, ujw* 

cf. R0SS81, HOLL83). 
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Thce Characteristics ^cf protein ^that^ are 
recognized by a ^^"^"^^^^^''^^"dlsplayed on the 
rrrrtiU rtfrl^ »o.ter-s..ace transport 

signals" . 

The genetic entity that carries the osp-ipbd gene
through the selection-through-binding process, see
Sec. 14, is referred to hereinafter as the operative
cloning vector (OCV). When the OCV is a phage, it may
also serve as the GP. The choice of a GP is dependent
in part on the availability of a suitable OCV and
suitable OSP.

. hiv the GP is readily stored, for e==a»ple, 
preferably, the GP i ^^^^^ ^ 

by freezing. If the ^^^^^ 

short doubling tiBe, such as 20 40 ^.^^ 

■54- chould be prolific, e.g., 
is a virus, it should dp finiclcy 

- at least -/-^-^-^^f^JoU. .be OP should 
or e=q>ensxve to -f^'^^J^ centrifugation. The 

te easy to harvest "^"^f J ^.^^^ range of -70 

.P is preferably ^tab ^for a t»p^^^ ^^^^ ^^^^^ . 

-^.-tra/Urs found — - ^ 

. - ri::;' - - - 
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«vr.i, as 4M urea or 2M guanidinium 
Triton, chaotropes such f''' ^^^^^ 

HCX, co^on ions ^1 l™, and 

organic solvents ^ ^^^^^ , ,.i.a.le 

degradative enzymes. Finally, ^ 

5 OCV (see Sec. 3) . 

Preferably, the 3D structure of the OSP and the
sequence of the osp gene are known. If the 3D
structure is not known, there is preferably knowledge
of which residues are exposed on the surface, the
location of the domain boundaries within the OSP,
and/or of successful fusions of the OSP and a foreign
insert. The OSP preferably appears in numerous copies
on the outer surface of the GP, and preferably serves a
non-essential function. It is desirable that the OSP
not be post-translationally processed, or at least that
this processing be understood.

The preferred GP, OCV and OSP are those for which
the fewest serious obstacles can be seen, rather than
the one that scores highest on any one criterion.
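The selection rule stated above (prefer the combination with the fewest serious obstacles, not the one with the single best score) can be sketched as follows. Everything in this example is hypothetical: the candidate names "A" and "B", the criteria, the scores, and the tie-breaking by mean score are assumptions made for illustration and do not come from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    scores: dict  # criterion -> score in [0, 1]; hypothetical values
    obstacles: list = field(default_factory=list)  # serious problems


def pick_by_fewest_obstacles(candidates):
    """Prefer the candidate with the fewest serious obstacles.

    Ties are broken by the higher mean score (an assumption).
    """
    def key(c):
        mean = sum(c.scores.values()) / len(c.scores)
        return (len(c.obstacles), -mean)
    return min(candidates, key=key)


def pick_by_single_best_score(candidates):
    """The rule the text advises against: best score on any one criterion."""
    return max(candidates, key=lambda c: max(c.scores.values()))


candidates = [
    Candidate("A", {"storage": 1.0, "osp_known": 0.2, "ocv": 0.3},
              ["OSP structure unknown", "no convenient OCV"]),
    Candidate("B", {"storage": 0.7, "osp_known": 0.7, "ocv": 0.8}, []),
]

print(pick_by_fewest_obstacles(candidates).name)   # B
print(pick_by_single_best_score(candidates).name)  # A
```

The two rules disagree here on purpose: candidate "A" excels on one criterion but has two serious obstacles, so the fewest-obstacles rule passes over it.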

Next we consider general answers to the questions
posed in this step for the cases of: a) vegetatively
growing bacterial cells (Sec. 1.1), b) bacterial spores
(Sec. 1.2), and c) bacteriophages (Sec. 1.3). Preferred
OSPs for several GPs are given in Table 2.

Sec. 1.1: Bacterial cells:

One may choose any bacterial strain which [text
illegible]. The questions in this case are: a) do we
know enough about mechanisms that localize proteins on
the outside of the cell, b) will the IPBD fold in the
environment of the outer membrane, and c) will cells
change expression of osp-pbd, derived from osp-ipbd,
during affinity separation? Some IPBDs may need large
or insoluble prosthetic groups, such as an Fe4S4
cluster, that are available within the cell, but not in
the medium. The formation of Fe4S4 clusters found in
some ferredoxins is catalyzed by enzymes found in the
cell (BONO85). IPBDs that require such prosthetic
groups may fail to fold or function if displayed on
bacterial cells.






Sec. 1.1.1: Preferred Bacterial Cells as GP:

In view of the extensive knowledge of E. coli, a
strain of E. coli, defective in recombination, is the
strongest candidate as a bacterial GP. Other preferred
candidates are Salmonella typhimurium, Bacillus
subtilis, and Pseudomonas aeruginosa.

Preferred Outer Surface Proteins for
Displaying IPBDs on Bacterial Cells:

Gram-negative bacteria have outer-membrane
proteins (OMPs), which form a subset of OSPs. Many OMPs
span the membrane one or more times. The signals that
cause OMPs to localize in the outer membrane are
encoded in the amino acid sequence of the mature
protein. Fusions of fragments of omp genes with
fragments of an x gene have led to X appearing on the
outer membrane (BENS84, CLEM81). If no fusion data are
available, then we fuse an ipbd fragment to various
fragments of the osp gene and obtain GPs that display
the osp-ipbd fusion on the cell outer surface by
screening or selection for the display-of-IPBD
phenotype.
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Oliver Has reviewed mechanisms of protein 
secretion in bacteria (OLIV85 and OLIV87). Nikaido and
Vaara (NIKA87) have reviewed mechanisms by which
proteins become localized to the outer membrane of
Gram-negative bacteria. For example, the LamB protein
of E. coli is synthesized with a typical signal
sequence which is subsequently removed. Benson et al.
(BENS84) showed that LamB-LacZ fusion proteins would be
deposited in the outer membrane of E. coli if
residues 1-49 of the mature LamB protein are included
in the fusion, but that residues 1-43 are insufficient.

XamB Of I. U - P-i" "^"7 

maltcae^ctrin transport, and serves as the r^'^"^ 
,s adsorption of bacteriophages lambda and KIO. This 
X^IZ. been purified to homo,eneity <™>.78) and 
to function as a trimer • 

-r-:::r;rar 

20 CLEM81, CLEM83, GEHR87) . 

Topological models have been developed that 
describe the function of phage receptor and 
describe describe these 

maltode^^in ^ „3pect to the 

TZ. TJZJZ^^ <c--. 

HEIN88) . 

LamB is transported to the outer membrane if a
functional N-terminal sequence is present; further, the
first 49 amino acids of the mature sequence are
required for successful transport (BENS84). Homology
between parts of LamB protein and other outer membrane
proteins OmpC, OmpF and PhoE has been detected
(NIKA84), including homology between LamB amino acids
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„f the other proteins. These 
and p,,,eins for transport to 

subsequences may label P antibodies 

the outer BeB^rane. ^^^^^ ' ..^^^ LeooB, have 

aerived fro. mice ^--"^^J^^'J^^^ topological and 
used to c^racter- f ur - 

functional regions, two 
maltose transport (GABA82) . 







localised accordxn, to „ «^o86) 

,„.ein «>e ^ H^e. L a.ino acid wMc. 

Tnat IS, If cytoplasm, then PhcA appears 

normally .s ound - .^^.^.a after an amino 

in the cytoplasm. If ^ „r, then the 

acid n-^"|/f„f^, :ron'LeViPXasmi= side of the 
PhoA domain is looaliz ^.^ collea^es 

.emhrane, and and>ored in l^- ^^.^^^ 

,B.CK3S, have .or inte^al 

gene that can he insert ^^^^^^^ ^„ 

membrane proteins su* that J ^^^^^^^ 

either the cytoplasm or T;ne p 
Where the laSS 9ene was inserted. 

„,„*,ins need not fill a 
OSP-IPBD fusion proteins s„m-neqative 
1. ir, the outer membranes of Gram ney 
structural role in membranes are not 

bacteria because ^^^^^.s there is lively to be 

,i,hly °^^l^llJ°l Jion ^ can be truncated and 

zi:r^ - --i,^ rf^ 

Will display "-^:"^\:r.Ta:rbeen Shown to 
, between fragments of ose ana _ 
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the cell surface, we can design an 
display X on the cell DNA 

^ ,ene oMP-IPBD fusion is 

sec^ence. "^'^ J^^r^^^^^^^ of the .est ^ 
preferably sought by ^^^^ f ^nd testing the 

.o an iEfed, --P--^^^^^^! ^^^^^^^ p.enotype. We use 
.esultant^- tta ^1 to ^iC the point or 

the available <^^^^ to maximize the 

points of fusion between ^ Alternatively, 
livelihood that IPBD will be ^^^'^ ^ that 
, ,e truncate ^ at ^f^^^^ fuse the 

produces ^ fragments ^^j;;^^^^^,^^ fusion are 
osE fragments to i£bd; ^^^^^^^ ^^^^^^ the cell 
screened or -^^^^^^^^trn include short 

surface. An ^^^^^^^^^ ^^^.TLion of ^ fragments 
5 segments of random DNA m the resulting 

to iB^ and Jf\^^/^,3mbers exhibiting the 
variegated population for 
display-of-IPBD phenotype. 

subject to regulation by -J^^^ promoter) . 

as isopropyl t^^^^^^^^^^^ gene; any 

It need not come J .^ed (M^I82) • 
regulatable bacterial promoter can 

Once a genetic packaging system employing
vegetative bacterial cells has been designed, it is
time to choose an IPBD (Sec. 2).



As an alternative to choosing a natural OSP and an
insertion site in the OSP, we can construct a gene
comprising: a) a regulatable promoter (e.g. lacUV5),
b) a Shine-Dalgarno sequence, c) a periplasmic
transport signal sequence, d) a fusion of the ipbd
gene with a segment of random DNA, e) a stop codon,
and f) a transcriptional terminator. The random DNA,
which preferably comprises 90-300 bases, encodes [...]
(KAIS87). The fusion of ipbd and the random DNA could
be in either order, but [...] is preferred. Isolates
from the population generated in this way can be
screened for display of the IPBD.

version of ---"^'^ :p s-^^^ ^^^^ 
CPS that display IPBD on the oSTS. 

Ltain a DK. J^^ro, ^ps .ay he screened 

Alternatively, clonal isolates 
.or tne display-of-IPBD phenotype. 

r^-F -hlie random DNA 

froB --"^"^Ha uset I. Part iH. 
successful OKIPBD) J"^ ''^^^^ ^ rs,ion of the 
introduce numerous ^^^uae gratuitous 

,ene, soBe of ""-'^^f then 

gratuitous stop ^^,,=;;rface. If E6i 

protein appearing on the = ^^^^ ^„,j„„^ in ESS 

random ^^-^ZtToS.-^^" Prot^ir. appearing on 
Bight lead to ^■"^"P"*!^, _ -„teins often are non- 

="\r:::-.s:irJs aispiavin, — te 

TsireasUV^^oved from the population. 

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Bacterial spores have l^^^^^'j:::^'^:^^^ 
„„didates. --"s on their s--ce. 

metabolize nor alter the P 
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„ ™ore resistant than vegetative 
However, spores are . ore « ^ ^^^^^^ 

bacterial cells « ^..avantage that the 
agents. spores have ,,tio„ are less 

,ole«lar -^^-^-^J^^^ formation oJ M13 or the 
' :::r:t:rorin:o^«.eo.ter.e.raneo..^. 

jreferred^Ba gt '-^T^^ S p ores^ 

. ^ the genus Baciliii^ form endospores 
Bacteria of the genu ^^^^^ 

are extremely ^^-^^^^ (reviewed 
radiation, ^-iccat^cn ^ ^^^^^ ^^^^^^ ^^^^ 

by Losick efe (L0SI85) ) species-specific 

rervTtins^::: — - use ^ spores 
as genetic packages. 

DNA is commonly included in spores. 
Plasmid DNA xs c ^^^^ observed on the 

Plasmid encoded proteins ha sporulation 
surface of lasUli^ 't'^rtLlt is Lderately 

well understood (^^^^^^j ^,ding sequences 

sporulation promoters are ^ expressed only 

operatively linked to such promoters 
during sporulation (BAYC87) . 

^ .1 have identified several polypeptide 

~of B ^ ^PO- 
components of ^^^^ ^^^^^.^^ ammo- 

, sequences of tvo P ^^^^ ^^^^ determined. 

terminal fragments of two synthesized in the 

+-he spore are s>yA**^" 
some components of ^ ^^^^.^^i^le spore proteins 

forespore, S^i^ ^"^-^-^ svnthesized in the 

(^X«,, while other components "^JJ" 
5 mother cell and appear in the spore 
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p„tei„s,. This organisation o. synthesis is 

controllsd at the transcriptional level. 

Spores self-assemble, but the signals that cause
various proteins to localize in different parts of the
spore are not well understood; presumably, the signals
controlling deposition of the coat proteins from the
cytoplasm of the mother cell onto the spore coat are
embedded in the polypeptide

and are then pro^ ^nnrjoavi Viable spores 

aeposition in - -to^Tird-t;pe are produced 

that differ only slightly tr proteins is 

:;:r,thTor:lcing a^nts are needed to^ solubiUse 
several of the proteins of the coat,. The 1 
protein, Cot., contains 5 =V=te3.es^ -tO 
contains an — ^ig. nu^er - histi^_^_ 
,„d prolines (7). ^ „^ methionine. CotC 

contains only =ne ^"'^ ,1,^ „ lysines 

has a very unusual ammo-acid sequen 

,K, appearing as , ^-K dipeptides and one is ^ ^ 
There are also 20 tyrosines m of which PP 
Y-Y dipeptldes. peptides rich in Y and K 
LcoJcrosslinXed in -—^ rias 

«„.S3, W^T..,. ,hejr„o., X. 

that nearly equals the 19 Ks. m „3ia,er cotC 

, K P 0 S, or W amino acids in CotC. Neitne 
L iot; i p st-translationally cleaved. The proteins 
TtA Ind cot^ are post-translationally cleaved. 

Bndospores fro. the genus - .ore =tahle 

are .oospores fro. J^™^- 

gjj^tilis forms spores in 4 to 6 hours, o 
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.pedes «y retire aay. o. ^ '^^^/t' J: 

TO imowledqe and manipulation xs 
aMit.cn, '--f than for other epore- 

„ore aeve opea .or ^ _ p^^^errea 

tormm bacteria. Th ^ - ^^^^ 

, over £^ endospores, but 

,.i.,.<-ridlu» also form very a ..omenient 

^ .... hoina strict anaerobes, are not conv« 
Clostridia, being strio of BasiUM is 

to culture. The choice of a =P-"- °^ ITHTnin, 
gcvemea by XnowW -tttion can be 

" "truer rparculTr -ai: is chosen by the 
controlled. A pair^ vegetative 

criteria listed ^ "-J",;^ ,J .porulation 

,l..he.i=al P;-™- - „ ^,nt not be 

beains so max: yj-^ 



begins 
15 available • 



Hi 1,1 i-rm-T — - Bacterial_§EoresL 

If a spore is chosen as GP, the promoter is the
most important part of the osp gene, because the
promoter of a spore coat protein is most active: a)
when spore coat proteins are being synthesized and
deposited onto the spore and b) in the specific place
that spore coat proteins are being made. In B.
subtilis, some of the spore coat proteins are
post-translationally processed. It is valuable to
know the sequences of precursors and mature coat
proteins so that we can avoid incorporating the
recognition sequence of the processing protease. The
sequence of a mature spore coat protein contains
information that causes the protein to be deposited in
the spore coat; thus gene fusions that include some or
all of a mature coat protein sequence are preferred for
screening or selection for the display-of-IPBD
phenotype.

Fusions of ipbd fragments to cotC or cotD
fragments are likely to cause IPBD to appear on the
spore surface. The genes cotC and cotD are preferred
osp genes because CotC and CotD are not
post-translationally cleaved. Subsequences from cotA or
cotB could also be used to cause an IPBD to appear on
the surface of spores, but the post-translational
cleavage of these proteins must be taken into account.
DNA encoding IPBD could be fused to a fragment of cotA
or cotB at either end of the coding region or at sites
interior to the coding region. Spores could then be
screened or selected for the display-of-IPBD
phenotype.

To date, no Bacillus sporulation promoter has been
shown to be inducible by an exogenous chemical inducer
[...]. The quantity of protein produced from a
sporulation promoter can be influenced by other
factors, such as the DNA sequence around the
Shine-Dalgarno sequence or codon usage.







.iterations governing insertion site in the 
The considerations g ^^^^.^^ ^ 3, 

30 spore OSP are the same as those gi 

Genes 
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.«nou,. tne c=ns«e«tions =p=«s are n 
..cterlax cells '^^^ ^-^^^^tVoteins to appear on 

becomes a more attractive option. 

„e can use the approach described above at 1^.4 

-rn ^^iT coll, except that:. 
£or attaching an IPBD to an ^ ceU, P 

a) a sporulation promoter iS used, a 
;Ulasmic Signal segaence should be present. 




UnliXe bacterial cells and spores^ f^'^^^Ji 

Phage ^-^/ri^ ^ X^^L i. 

of an OSP and how .it xm:erciv.v,c» 

„,td The size of the phage genome and the 
20 the capsid. The si i^^rtant because the 

packaging mechanism J ^ 

ZrlTll ^serted into the phage genome, 



t pbd gene 
therefore 






1) the virion must be capable of accepting the
insertion or substitution of genetic material, and

2) the genome of the phage must be small enough to
allow convenient manipulation.

Additional considerations in choosing phage are: 1)
the morphogenetic pathway of the phage determines the
environment in which the IPBD will have opportunity to
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^^n 3^ IPBDs needing large 
told «lthxn a cell, 3) ^^^^^ ^^^^ 

prosthetic ^-^^ -^^^^;tnr»a ^) "Oen variegation is 
prosthetic ^7^=/^f ''^',;ipi, infections coald 
introchiced in Part III, "it P ^^^^ 
generate hyhrid OPS that can:y ^ ^^^^ 
have at least some copies of a dll 
^faces, it is preferahle to »ini..ze this 

possibility. 

Bacteriophages are excellent candidates for GPs
because there is little or no enzymatic activity
associated with intact mature phage, and because the
genes are inactive outside a bacterial host, rendering
the mature phage particles metabolically inert. The
filamentous phage M13 and bacteriophage PhiX174 are of
particular interest.

.ne entire life cycle of -^Jl^^^jTZ 

-3, a co-cn ^^o^^x^^::Tz::tX^^^^ - 

nnderstood. ^'^ " f,,,^ relevant to both 

consider the ^^'^^f^^^^^^^^,,, i, for historical 
(RiSC36); any ^^«^^^^^, (^.^ collets sequence 
accuracy. Ihe genetic ' ten genes, 

,SCH.7S, , the location of the 

and the order of ^""^^^^^ physical 
promoters) of H13 is ^^^^^ ^^^3„_ ^^79, 

structure ot the ^i I ^^^^^ ^^3, 

»SS7., OHKASi. ■ J-'X ""'^^ 

and Z1HM82); see ""^^^ ^^^^ ^^^^ins. 
structure and function of the coat p 

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relevant faCs *cut H13 aisdcsed in Example 

I. 

5 ^tfvnd is a very small 

The baoteriopbage PhiX174 is ^ 

h»B been thoroughly studied by 
icosahedral virus which ^J^^ microscopy (See 

genetics, biochemistry, and 

,0 proteins '^^^ ^^'^ ^ ^ as a cloning vector 

diffraction. PhiX174 n additional DNAJ 

hecause PhiX174 can accept almost no 
«e virus is so tightly -^^3/; ,hat 

,enes overlap. ^^/^y ^ild-type 2 gane 

- : Hlfsmt; sTU L host supines this 

protein. 

.rrre=r^trr<-s!:::tz;:: 

spiKs protean, J^ copx P^^^^^ H comprises amino 

acids. The F protein interacts with the
single-stranded DNA of the virus. The proteins F, G,
and H are translated from a single mRNA in the virally
infected cells.
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T.a-r qp. TOTA Phages 

Phage such as lambda or T4 have much larger
genomes than do M13 or PhiX174. Large genomes are less
conveniently manipulated than small genomes. A phage
with a large genome, however, could be used if genetic
manipulation is sufficiently convenient. Phage
lambda and T4 have more complicated 3D capsid
structures than M13 virions, with more OSPs to
choose from. Phage lambda virions and phage T4 virions
form intracellularly, so IPBDs requiring large or
insoluble prosthetic groups might fold on the surfaces
of these phage. Phage lambda and phage T4 are not
preferred; however, derivatives of these phages could
be constructed to overcome these disadvantages.



RNA Phages

RNA phage are not preferred because manipulation of
RNA is much less convenient than is the manipulation
of DNA. Although RNA bacteriophage are not preferred,
genetically altered RNA-containing particles could be
derived from RNA phage, such as MS2.

MS2 as a GP, we would need to eliminate 
To use MS2 as a ^ir, ^^n-iobd 
,0 «ost of the natural viral 5-°-;° : 
-F-ii- into the protein capsid. i""^ 
gene could fit into P ically to a 

..at the A prote^ binds ^ ^^^^^^ ^^^^^^^^^^ 

site at the 5 ena o ^ , if coat protein 

formation of «.x-conta^xng P^-^^^^Yno the A protein 

a message containing tne a f 
25 is present. If a mes , ^^^^^ ^^^^ ^^^^^^ 

binding site and the ^,3, contained A 

and a PBD were produced in a "^^^^ ^^^^ f„„ 

. protein and wild-type ^^'^ ^°^J^fZ. coding tor 

regulated ^--J^^^ ^U- ^ — 

30 the chimeric protein »oul g ^^^^ 
comprising ENA encapsulated by protei 

- satisfies the " --"X, ^ItMng 'on the 
:::rdrtrprr:icr:; Te-selves are not viable. 
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we 



would need to:

1) separate the RNA from the protein capsid,

2) reverse transcribe the RNA into DNA, using [...]
or MMTV reverse transcriptase, and

3) amplify the DNA by several cycles of polymerase
chain reaction (PCR) until there is enough [...]
material for sequencing and further work.
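The amplification step above relies on the roughly exponential growth of template copies per PCR cycle. The sketch below estimates how many cycles are needed to reach a desired copy number; it is a simplified model, and the function names, the per-cycle efficiency parameter, and the example copy numbers are assumptions introduced here and are not values from the specification.

```python
import math


def copies_after(n_cycles, start_copies=1.0, efficiency=1.0):
    """Copies after n cycles; efficiency 1.0 means perfect doubling."""
    return start_copies * (1.0 + efficiency) ** n_cycles


def cycles_needed(target_copies, start_copies=1.0, efficiency=1.0):
    """Smallest whole number of cycles that reaches target_copies."""
    if start_copies >= target_copies:
        return 0
    ratio = target_copies / start_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    # Hypothetical example: 1e3 cDNA copies amplified to 1e9 copies.
    print(cycles_needed(1e9, start_copies=1e3))
    print(cycles_needed(1e9, start_copies=1e3, efficiency=0.8))
```

Real reactions plateau as primers and nucleotides are consumed, so the model only applies to the early, exponential phase.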

15 isolated phage. 

III 'Inn i'*'"°es=. 

For a given bacteriophage, the preferred OSP is
usually one that is present on the phage surface in the
largest number of copies, as this allows the greatest
flexibility in varying the ratio of OSP-IPBD to wild
type OSP and also gives the highest likelihood of
obtaining satisfactory affinity separation. Moreover,
a protein present in only one or a few copies usually
performs an essential function in morphogenesis or
infection; mutating such a protein by addition or
insertion is likely to result in reduction in viability
of the GP.

It is preferred that the wild-type osp gene be
preserved. The ipbd gene fragment may be inserted
either into a second copy of the recipient osp gene or
into a novel engineered osp gene. The preferred OSP
for use when the GP is M13 is the gene III protein (see
Example 1).

The user must choose a site in the candidate OSP
gene for inserting an ipbd gene fragment. The coats of
most bacteriophage are highly ordered. Thus in
bacteriophage, unlike the cases of bacteria and spores,
it is important to retain, in engineered OSP-IPBD
fusion proteins, those residues of the parental OSP
that interact with other proteins in the virion. A
preferred site for insertion of the ipbd gene into the
phage osp gene is one in which: a) the IPBD folds into
its original shape, b) the OSP domains fold into their
original shapes, and c) there is no interference
between the two domains.
interference between i^xi 

If there is a 3D model of the phage that indicates
that either the amino or carboxy terminus of an OSP is
exposed to solvent, then the exposed terminus of that
mature OSP becomes the prime candidate for insertion of
the ipbd gene. A low resolution 3D model suffices.

In the absence of a 3D structure, the amino and
carboxy termini of the mature OSP are the best
candidates for insertion of the ipbd gene. A
functional fusion may require additional residues
between the IPBD and OSP domains to avoid unwanted
interactions between the domains. Random-sequence DNA
or DNA coding for a specific sequence of a protein
homologous to the IPBD or OSP, can be inserted between
the osp fragment and the ipbd fragment if needed.

Fusion at a domain boundary within the OSP is also
a good approach for obtaining a functional fusion.
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. -^.d such a boundary when suhcloning 
smith exploited such _ 
heterologous DNA into gene HI of f ^ ^ 

^ Tn^thods of identifying domains, 
^e.. ar. ^--^^ -^^^^ coordinate, ^ave been 
' rLran: C.:r ,.»X3S, see also BOSKSS. 

,„ structural information available is 

,0 the amino acxd sequenc^ ^^^^^ ,^ , 

tne sequence to predict tnr ^^^^ 

.i,. p-''^^"^^^*!:^^:; irjd /as«n, .chou,.,,, 

-"^''^^^.f^fir/Uf candidates .or insertion o. 
these locations are aiso 

15 the iEb^ gene fragment. 

- .temativel. ---^1^ ^ — t 

aetermined functional strain by 

constructions and selecting osP-IPBD Bust 

„ic ----- ,:,e coat, it is 

^^r X p-tlcular random se^ence 
" ^Tlto l /ane wUl produce a fusion protein 

coupled to the iEE_ 9 ^ lunctional way. 

that fits into the CO l^rga 
nevertheless . random ^"^^^^^ ^ ,ene will 

f^^ents Of a coat P-«-^7^^; ^ c^ain one or 
30 produce a population that ^ 1:^1^ ^^^^^ ^ 

„ore members that display the IPBO on 

--Vris ^rr^tr^d^nd^ se^ences 
cloned into appropriate sites. 
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<=^«T« Tiai-Tirallv occurring 
An IPBD may be chosen from naturally 
p.ote"s or ac^ins naturally occurr^, prote . r 

protein may nave ^^-^^^^ J.^T ,) the designed 
.> the rr,; distribution o* 

n-rotein is smaller, ana cj ^ ^ ,„ 

prorexn x ^ . v^A snecified more freely, 

the designed protein can be specirie 

A candidate IPBD must meet the following criteria:
1) stability under the conditions of its intended use,
2) the amino acid sequence is obtainable, 3) knowledge
of which residues are on the outer surface, and their
spatial relationships, and 4) availability of a
molecule, AfM(IPBD), having high specific affinity for
the IPBD.

Preferably, the IPBD is no larger than necessary
because it is easier to arrange restriction sites in
smaller amino-acid sequences. The usefulness of
candidate IPBDs that meet all of these requirements
depends on the availability of the information
discussed below.

J -iudae IPBD suitability 

xnformaticn -d^ :udg ^^^^^ ^^^^^^ 

includes: 1 a 3D ^ ^^^^^^^^^ 

r«>e - - "TtriiTtrn 

A^ the stability and solubility 
:n:;e:ture, PH and ionio str^ .-rab..o» 

J. i^a c-hable over a Wiae iremy'^ 

to be stable ° ability to bind metal 

conditions of intended use) , 5) ability 
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^ ^. no p«.e3:ence, ■ -^^^^tl .u. .ay 

,^o»lea,e Preterrea ac..^^^ " 
cause problems) , '> ^^^^ preferred) , 

'^'^'^TJaTT.' rZZl .avU specnc ana 
8) availability ot a ^ ^p^P 

strong affinity ( ^ ^^^^^ 
(preferred) , 9) availability of 
specific ana .eaiu. affinity (J , 
for the IPBD (preferrea), 10) f^^^^ „clecule(s) 
XPB. t.at does not ^^^^ ..ible, 

^ -mr^ipcule having affinity 
Tf only one species of molecuxe a ^ 
XX onxy r T-h will be used to. 

forXPBO (.fMdPBD)) ^^'2TTs:rtZ%> opti.i.e 

„ aetect the 1.B0 on ^e CP 

e:^«ssion level ana density of 

on the Batrix (Sec 10.1) . ^ J separation 
efficiency and sensitivity of the a.f^^^ y^^^^^^_ 

<secs. "-^ -\"-^^;^/\,,,,3ble tvc species of 

r^Brre^r;:^.-.^ 
reats^i-=r.a^^^^^^^^^^^^^^ 

.na sensitivity ^ > '^^^r Jicn ,10.1,. 

moderate affinity would be used m op 

For at least 20 candidate IPBDs the above
information is available or is practical to obtain, for
example, bovine pancreatic trypsin inhibitor (BPTI, 58
residues), crambin (46 residues), the third domain of
ovomucoid (56 residues), T4 lysozyme (164 residues),
and azurin (128 residues).

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. A f-rr^m a PPBD according to 

,ne ^„;rXected towara «.e solvent, 

having side groups ^.^^ ^^^^^ 

K^osed residues ^i^ited in this 

acids. t„i=auy nave only 

«,ard (REID8S,. the PBB, but 

=mall effects on ^ ^^^^ the chosen 

reduce the stahUity c. ^^^^^^^^ 

IPBD should have a hign ^ 

.cceptahle. ^^^^ rar^raL , H-O to .0 
„ide PH range (8^0 t^ ^^^^^^ ^^„3,„ 

preferred), ^ "■"^,,,,,i„„.throu,h-hlnding will 
IPBD by mutation and sele ^^^.^^.^ly, the 

retain sufficient ^^ab y- ^^^^^^ 
substitutions in the 1™°.^ ^^^^^ b^iow 50°c. 

not reduce the melting point of the c 

Two general characteristics of the target
molecule, size and charge, make certain classes of
IPBDs more likely than other classes to yield
derivatives that will bind specifically to the target.
Because these are very general characteristics, one
can divide all targets into six classes: a) large
positive, b) large neutral, c) large negative, d) small
positive, e) small neutral, and f) small negative. A
small collection of IPBDs, one or a few corresponding
to each class of target, will contain a preferred
candidate IPBD for any chosen target.

Alternatively, the user may elect to engineer a
GP(IPBD) for a particular target; the following
criteria relate target size and charge to the
choice of IPBD.
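The six target classes named above are simply the combinations of two sizes and three charge states. The following sketch enumerates them and assigns a target to a class. It is a hypothetical illustration: the text gives no numerical size threshold, so the caller supplies the large/small decision directly, and the function and variable names are assumptions introduced here.

```python
from itertools import product

SIZES = ("large", "small")
CHARGES = ("positive", "neutral", "negative")

# The six classes, in the order a) through f) used in the text.
TARGET_CLASSES = [f"{size} {charge}" for size, charge in product(SIZES, CHARGES)]


def classify_target(is_large, net_charge):
    """Return the class name for a target.

    is_large:   True for macromolecules, False for small molecules
                (the cutoff is left to the user; none is given in the text).
    net_charge: net charge under the intended conditions of use.
    """
    size = "large" if is_large else "small"
    if net_charge > 0:
        charge = "positive"
    elif net_charge == 0:
        charge = "neutral"
    else:
        charge = "negative"
    return f"{size} {charge}"


if __name__ == "__main__":
    print(TARGET_CLASSES)
    print(classify_target(True, -3))  # large negative
```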
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X. the target is a protein cr o^r --.ol^-^^ 

a preferred embodiment of the IPBD is a small protein
such as BPTI from Bos taurus (58 residues), crambin
from rape seed (46 residues), or the third domain of
ovomucoid from Japanese quail (56 residues), because
targets from this class have clefts and grooves that
can accommodate small proteins in highly specific ways.
If the target is a macromolecule lacking a compact
structure, such as starch, it should be treated as if
it were a small molecule. Extended macromolecules with
defined 3D structure, such as collagen, should be
treated as large molecules.

If the target is a small molecule, such as a
steroid, a preferred embodiment of the IPBD is a
protein the size of ribonuclease from Bos taurus (124
residues), ribonuclease from Aspergillus oryzae (104
residues), hen egg white lysozyme from Gallus gallus
(129 residues), azurin from Pseudomonas aeruginosa (128
residues), or T4 lysozyme (164 residues), because such
proteins have clefts and grooves into which small
target molecules can fit. The Brookhaven Protein Data
Bank contains 3D structures for these proteins. Genes
encoding proteins as large as T4 lysozyme can be
manipulated by standard techniques for the purposes of
this invention.
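The size-based guidance above can be summarized as a lookup from target type to candidate IPBDs. The sketch below is illustrative only: the residue counts are those given in the surrounding text, while the dictionary layout, the function names, and the use of three nucleotides per residue (stop codon excluded) to estimate coding length are assumptions introduced here.

```python
# Candidate IPBDs and residue counts as listed in the surrounding text.
SMALL_IPBDS = {
    "BPTI": 58,
    "crambin": 46,
    "ovomucoid third domain": 56,
}
LARGER_IPBDS = {
    "ribonuclease (Bos taurus)": 124,
    "ribonuclease (Aspergillus oryzae)": 104,
    "hen egg white lysozyme": 129,
    "azurin": 128,
    "T4 lysozyme": 164,
}


def suggest_ipbds(target_kind):
    """Suggest candidate IPBDs for a target.

    target_kind: "macromolecule" (protein or other large target) or
                 "small molecule" (e.g. a steroid).
    """
    if target_kind == "macromolecule":
        return dict(SMALL_IPBDS)
    if target_kind == "small molecule":
        return dict(LARGER_IPBDS)
    raise ValueError(f"unknown target kind: {target_kind!r}")


def coding_length_nt(residues):
    """Nucleotides needed to encode the residues (3 per codon, no stop)."""
    return 3 * residues


if __name__ == "__main__":
    for name, n in suggest_ipbds("small molecule").items():
        print(name, n, coding_length_nt(n))
```

Even the largest candidate listed, T4 lysozyme, needs only a few hundred nucleotides of coding sequence, which is consistent with the statement that genes of this size can be handled by standard techniques.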

™,-rioT-al insoluble in water, 
T-F i-he target is a mineral, insoxu" 
30 If the targ mineral's molecular 

one must consider the nature Of crystalline 

surface. Smooth ^-^^^ J;^^: J'^su^ as 
Silicon) require -^^J^ J^f J 3ufficient 

3. and s ^icity. .ough, grooved surfaces 
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<+.v,ar- bv small proteins 
,«olites), could be tK,und -^l-erjy 
'(^PTI, or larger proteins (T4 lysozyme, . 







cMrga can P^^-^-^^/^^'^^^.^j^e, it is preferred 
surfaces froM J':rinte;dsd use, the IPBD 

that, under the condxtions of x ^^^^^^ 

the target '^^^^^:^: ^^^ZJ^o. counter ions 
that one of theB is "^'''^^f ' ' repulsion, 
can reduce or eliminate electrostatic 






If the chosen IPBD is an enzyme, it may be
necessary to change one or more residues in the active
site to inactivate enzyme function. For example, if
the IPBD were T4 lysozyme and the GP were E. coli cells
or M13, we would need to inactivate the lysozyme
because otherwise it would lyse the cells. If, on the
other hand, the GP were PhiX174, then inactivation of
lysozyme may not be needed because T4 lysozyme can be
overproduced inside E. coli cells without detrimental
effects and PhiX174 forms intracellularly. It is
preferred to inactivate enzyme IPBDs that might be
harmful to the GP or its host by substituting mutant
amino acids at one or more residues of the active site.
It is permitted to vary one or more of the residues
that were changed to abolish the original enzymatic
activity of the IPBD. Those GPs that receive genes
encoding an active enzyme may die, but the majority of
sequences will not be deleterious.
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10 



less than 10 

OCT is preferably ^-^^ ' ;„„,enesis be 

^. „ U desirable «.at cassette - J 

practical in the OCT, P 

restriction enzymes are '^ailaB ^i^gie-stranded 
ocv. Xt is mewUe ^'^^^^^^^ J, ,„ferably 
„uta,enesis be practical-^-al V. ^^^^^ ^3 
,^ies a selectable BarKer_ ^ „,uable 
obtained or is -^^^^^ ^^^^ ..^r tbe bacterial 
,,,,„3. PlasBids much .ore 

chromosome because ^ chromosomal 

rraXil^^arre .0 be used, the ^ 
, irmustbe inserted into the pha,e ,enome. 

an antibiotic resistance 
.or P.age sue. -J^^^^ ,hX.,30) . Here 

g,,, is engineered ^^J^^^^ ,,^,3 discernable 

virulent pliage, ..^e a resistance 

,0 places that ^^^ ^ l^""^^,^,,^, there is no room xn 
gene is not essential, furt ^ material, 

the PhiXl74 virion to ^^l^^^^^^lJ,,^,^,, gene is a 
Xnahility to include - an^-^-/^^^^ that 
disadvantage because it limits tn 

25 can be screened. 

It is preferred that GP(IPBD) carry a selectable
marker not carried by wtGP, and that
wtGP carry a selectable marker not carried by GP(IPBD).

30 

_DssiSIliBa.tiia-2S^^ 
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■ «rid sequence that will cause 
we design an a.ino ac.d ^^^^ 

the IPBD to appear on the 
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•.o acid sequence may determine the 
expressed. Thxs f ^^^2^ °^ 
entire codin. ^V"^,,^,,-;^!^ restriction 
contain only the iEfed s qu ^^^^ ^^^^^ ^^^^ ^ 

sites into vhich random DNA will o 

^ «;,v be produced by any means. The 

segment, derived tro aescrlbed in 

easily .--^""!^-::f "e^n" 

10 because they aiJ-ow yx 
restriction sites. 

Regarding regulation of the osp-ipbd gene, the two
important questions are: how much IPBD do we
need on each GP, and how accurately must we regulate
the amount?

-I , ^-F-Finitv for the targe-c* -^-^ ^ 
PBOS having low amnity ^^^^ ^^^^^ 

°* -^"^^ wUy-Wnding PBDS will cease to 

conditions, then all weaji y ^^^^^ 
bind before any strongly-bmamg 

that all 

regulation of the ^^"^,7 J e«-=^ ' 

pacKages display -"-"""^^ ,3D/GP had an 

separation in Sec is. ^^^^ ^j^e 

affect on the -tion ^^th ^ ^^^^^^^ 

affinity matrix, then we wo i„, analysis 

amount of PBD/GP accurately. ™ 

.nows that there is no ^^^^^^^^t all aPs are 

„ elution between the PBOs 

the same size, b) that interactions between the PBDs and
the affinity matrix dominate differential elution
of GPs, c) that the system is at equilibrium, and d)
that all PBDs on any one GP are identical.

If Np identical PBDs on a GP each have access to
target molecules, and each PBD has a free energy of
binding to the target of delta Gb, then the total free
energy of binding is

delta Gb(tot) = Np * delta Gb.
Delta Gb is a function of parameters of the solvent,
such as temperature, the concentration of neutral
solutes, and the concentration of specific ions.

If conditions are altered so that delta Gb approaches
zero, delta Gb(tot) approaches zero Np times faster. As
delta Gb goes to or above zero, the packages will
dissociate from the immobilized target molecules and be
eluted.

GPs bearing more PBDs have a sharper transition
between bound and unbound than packages with fewer of
the same PBDs. At equilibrium, the midpoint
of the transition is determined only by the
solution conditions that bring the individual
interactions to zero free energy. The number of
PBDs/GP determines the sharpness of the transition.
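
The sharpening effect can be illustrated with a minimal two-state sketch. It assumes, which the text above does not state, that the bound fraction follows a Boltzmann distribution in the total free energy Np * delta Gb. The function name, the units (kcal/mole), and the value of RT are illustrative choices, not part of the described method.

```python
import math

RT_KCAL_PER_MOL = 0.593  # approximate RT near 25 C, in kcal/mole (assumed)


def fraction_bound(delta_gb, n_pbd, rt=RT_KCAL_PER_MOL):
    """Two-state estimate of the bound fraction of packages.

    delta_gb : free energy of binding per PBD (negative = favorable)
    n_pbd    : number of identical PBDs per GP (Np)
    """
    total = n_pbd * delta_gb  # delta Gb(tot) = Np * delta Gb
    return 1.0 / (1.0 + math.exp(total / rt))


if __name__ == "__main__":
    # The midpoint (delta Gb = 0) is the same for every Np;
    # only the steepness around it changes.
    for n_pbd in (1, 5, 20):
        row = [round(fraction_bound(dg, n_pbd), 3)
               for dg in (-0.5, -0.1, 0.0, 0.1, 0.5)]
        print(n_pbd, row)
```

Under this assumed model every curve passes through one half at delta Gb = 0, while packages with larger Np switch from bound to unbound over a narrower range of delta Gb.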

It should also be noted that the number of PBDs/GP
is usually influenced by physiological conditions so
that a sample of genetically identical GPs may
contain GPs having different numbers of PBDs on the GP
surface. In a population of GP(vgPBD)s each PBD
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. U I' thte 1. . linear effect on elusion v=l.»e 
o iler o* PBDS/CP, tnen the CPs having the great^t 
:Ler of PBD, will he »ost retarded -^7; 
«hen we culture the enriched population the S^*''^'^^' 
. he amplified and give rise to new - <^BD., s hav ng 

^ nnn /av Thus the affinixzy 

.0 varying ^ " /\/\^^"„sent Invention could 
separation ^^H^^^J"":^ PBOe/aP on the 

tolerate a linear e«-^ ^i„,i„g t„ 

^aintion voliiiae of the G¥(^ou} uaij-^ 

t^;:^r fortuitously causes the PBO to he displayed on 
15 the GP only in low number. 

Since there is no linear effect on elution volume
from the number of IPBDs/GP, need for highly accurate

Of 1PB.0P is not — -j-r::": 

" ::;Ur= t::n rstiriv^: genetic ele.ents 
analysis above assumes that GP(IPBD)s are in
equilibrium between solution in buffer and bound to the
affinity matrix. Rate of elution may be an important
parameter in column affinity
elution from an affinity matrix el«l 
,ffi„ity Plate, the ti^e that e.^ -ffer is in ontact 

„ith the ^"^^^^.^'11^'^,,^^ :,^^'^^.^^^^^ 

variable. The density of affinity molecules on the
matrix is an important variable in optimizing the
affinity separation. Because the analysis above is
qualitative, in Sec. 10 we
experimentally optimize: 1) the density of IPBD on the
GP surface, 2) the density of affinity molecules on the
affinity matrix, 3) the initial ionic strength, 4) the
elution rate, and 5) the quantity of GP/(volume of
matrix) to be loaded on the column.

Transcriptional regulation of gene expression is
best understood and most effective, so we focus our
attention on the promoter. A number of promoters are
known that can be controlled by specific chemicals
added to the culture medium. For example, the lacUV5
promoter is induced if isopropylthiogalactoside is
added to the culture medium, for example, at between
1.0 uM and 10.0 mM. Hereinafter, we use "XINDUCE" as a
generic term for a chemical that induces expression of
a gene. If transcription of the osp-ipbd gene is
controlled by XINDUCE, then the number of OSP-IPBDs per GP
increases for increasing concentrations of XINDUCE
until a fall-off in the number of viable packages is
observed or until sufficient IPBD is observed on the
surface of harvested GP(IPBD)s.

The attributes that affect the maximum number of
OSP-IPBDs per GP are primarily structural in nature.
There may be steric hindrance or other unwanted
interactions between IPBDs if OSP-IPBD is substituted
for every wild-type OSP. Excessive levels of OSP-IPBD
may also adversely affect the solubility or
morphogenesis of the GP. For cellular and viral GPs,
as few as five copies of a protein having affinity for
another immobilized molecule have resulted in
successful affinity separations (FERE82a, FERE82b, and
SMIT85).

Another consideration of promoter regulation is
that it is useful later to know the range of regulation
of the osp-ipbd gene (Sec. 8). In particular, one should
determine how nearly the absence of XINDUCE leads to
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« *-he GP surface; a non-leaky 
absence "^^^ :M,„ess is usem-. a, to 

promoter is P«*«^^- ^p,_,^)s lor AfMIIPBD) is 
3ho« that affinity of "^'^^-^ ,ii„„ growth of 
to th. orxJnCS if the expression 

, SP(s^) i» ° s The IMSffi promoter in 

of ^ 1= .tte r=xr;eprt=o7Tr a preferred 
conjunction with the . Lacli rep 

example. 

,i„n Is not limited to a single 

..e present procedure is an 

.ethod of gene design Jh ^^^^^ ^^^^^ 
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20 






needs of the present invention. 

« the "--tt/Tritil - 

aaflnlte seguence then ^^^^ 
constructed (Sec. 6.1). ^^^tructed first! 

to iEbd. then a "display P~^^ ^ ^..pi^te the 

rando. DHA is ^^ .^^ ^^^^^ e..) from 

population of putative ^^J^ in 

...ch a functional 

vivo selection or Kinareu 

one may use any genetic engineeri^ me-- « 

produce the — =^ '^™J^r ;uta:io„s ' to specific 
easily and accurately direct « 

3ites in the ^ ^"^^^'^Cj .ere , however, the 
methods Of mutagenesis -""^"^i;;;,; aifferent 

-Pr^-^- -t-He oSP-iESfi gene lu^i 

DNA sequence for the -E-^ ^ mature 

any other in ^^^^^^ method of 

c. ;f 3Uhseguences coding for the 

mutagenesis, one ^epla ^^^^^ to he mutagenized 

PBD with vgDNA, then su« 
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10 






within the OCV. If jxng 

directed mutagenesis is ^^^^ 
sequence of the subsequence coding for th 
unique within the OCV. 

.egulato. elements include: a) ^^J, 
Shine-oalgarno sequences, ^^^^^^^ ^^^^ 

terminators, consensus sequences of 

designed from knowledge of 
natural regulatory regions. 

«^ rt<»nes to be synthesized are 

•designed at the protein ^^^^^ ^^^..^ various 

ine amino acid se.pences are ^„,,,e 
,oals, including:, a, -^f'l^;^^^ ,,,,, and c, 

a OP. b, Change of charge o ^^^^^^ 

generation of a VoP'^^^on^ ^^^^^^^ 

^ SBD. The a^ig-ty ^» f and to 

::er rri^d-r^r Of »ino acid, at 

variegated codons. 
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. computer program mav he -^^^ ^i". 
possihie .„™. - ---- jridltif. pxa«= 
acid seg^ence given ^I J^^^^^^,,,,, restriction 
where recognition sites "•'^^ Iitering the amino- 
en^ymee could be provided without altering 

acid sequence. 

. =ni-es are positioned within the osEr 

Restriction sites are po ,,^t„een sites is 

^-h;,t the longest segment between sit; 
if ro^si:!:. Ues the produce cohesive 
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10 



. The codon preferences of the 

^essenser PKA are also considered. 

. es^U^ed ™ .or ,e„e .n^^esis U 

synthesize both strands „„^ieotides (nts) 

overlapping nathod that is more 

,^38). - ^ tn^HA. our method differs 

suitable for «Vn«>e"S of 9 ^„ 
from previous methods «>"J««' ° „„t out the 

a, use two ^"""^ ' J^^.s are: a, to 

«*ended DBA in the mlddl synthesized 
produce longer pxeces of „ to 

as ssDSA on commercial D»A y 

— strands ^^J^ we remove the 

:^:::r:tro:ar:^.-------'-^- 

„ synthesizers can Produce oli,o-nts^ of up « 

"° rrt^ "-;err needed to obta. 

parameters (the leng ^^^^^^ ^^^^^^ 

---"tst^tl^^^^^^^ can cut near the 
needed so that - J^^ determined hy DNA and 

end of blunt-ended dsDNA) a _ reasonable 

enzyme chemistry. = 10 and Ns - 

values . 

• the DKA sequence to be synthesized into 
„e divxde the MA s^ ^^^^^ 

two nearly equal ^„ overlap between 

total t bp CKW, containinq no 

^"d ?ases The overlap preferably, .s not 
variegated bases^ synthesize the 

palindromic and has nign 
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A ^>,^ 5' extension of each strand, 
overlap portion and tne completed with 

Klencw enzyme and all to^J^ t„ be 

.eqa^ce as blunt-ended J' to ten 

, U,ated to ot.e. ™. .^«,etlc deOK. 

:r :r e^^e^tlv w.tn an appropriate 

can then oe 

restriction enzyme (OLIP87) . 

Because Mdna is not rigidly fixed at 100, the
current limits of 190 nts overall and
100 in each fragment are not rigid and may be exceeded
by 5 or 10 nts. Going beyond the limits of 190 and 100
will lead to lower yields, but these may be acceptable
in certain cases.

The present invention is not limited to any
particular method of DNA synthesis or construction.



- - — o; z :;xrer 

standard means on a MilUgen 

.iXligen 7500 has seven via^^^^^^ ^^^^^ 
phosphoramidites may he taken ^^^^ ^^^^^ ^. 

four contain A, J-r ^ cosine or mixtures 

contain — T.e standard 

of bases, the so-called dir^ ^^^^^ 

software allows pro^a«.ed mixing 

four bases in e^aimolar quantities- 



or 



35 



.be present invention is not ll»ite to_^^an, 
particular ^ /^"r^Lcrop.oresis and 

tllctrriution an XBX device .International 
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CT^ is, preferably, 

.sea to puruy ^'^^J^Z^ l^^^^^^ ^-^"^^ '^'"^ 
corp., Baltimore, HD) are 







the synthetic gene is 
in the preferred »ettcd^ * „,„sfor.ed into 
constructed using ^^^^,3 («,I82, P250) or 

.acterial cells .y ^^f^^^^o^,. uternatively , DKA 
Slightly .odined standard « 
fragments derived t,„„ nature or to 

other fragments of D«A deri^^ ^^^^ 
synthetic D»A involves construction 

Tr:::i::r Pir^^dfrntainin, .rger and larger 



of a sei:i«=» — - 
segments of the complete gene 
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screening are use. ^^^^,1 sites t.a. was 
random DNA into one of the r 
aligned into the display probe. 

■ d in a variety of 
random DBA may possibility- 
„ays. Degenerate synthetic » 

Jtematively, P=-"" ^ lite (GCTG/c) has 
nature. «, tay^e at one end of the 

Seen designed into the display P ^^^^^ 

^ -::/;r ntXs'r-i- variety Of 

partially digest DSA that fragments «ith 

fences, g ^ J- - p„he has 

CATG 3' overhangs. 
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gene so that random DNA can be cloned 

4.v« ri^snlav probe is digested 
, P-sBid car^-^ Z^^^ -V»e and tne 
5 with t.e ^PP"^"^^ 7; ^,,,ed ana U,ated .y 
tra^entad, random »^ ^^^^.^^ ,3,^ « 

standard mathods. I^e 9 ^^^^^^^^ 
transform cells ^^^^ . ' ^^^ance gena. Plasmid- 
e.preasion o. -^^ ^^^^^^'^T^. aiaplay-o.-XPB. 
10 bearing SPs are then salec ^^^^ 
pnenotype by the P"-^"! '^^^ it .era the 

..eaent --.^Tfr isolate .P(P.O,s that 

:r to atrgat Vrom a'Targe population that do not 
15 bind. 

transformed with Ugated OCVs and 
cells are ^-^"^^"^ ^^^^^ ^„ appropriate 

20 selected '^^''^^''^^^^^.^^.o^ri.^^ 

incubation wx«. an agent PP^^^ ^^^^^^^ 

:aarlcers on the OCV ^^^^^ generally, 

appropriate *° J^^ ^esuspension of the 

centrifugatxon to P^^^f ^ j^^ffer (spores or 
25 pellets in sterile medium (cells) 

phage) . 

3„ -^ar.estadpao.gaa.en.ta^«^^^^^ 

XPB. on - -.^r a- o. XPBO or .«.(XPBO, 
be essential for levels. The tests can 

— — - — UngnTen^ymatioally. C 
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The ifflliraoi ^" 

by Affinity p«cipltat-n- ^^^^^^ 
step is one " ^e tPBD molecule and 

(preferably, Ka < 10 > ^^.^ ^^^„ple, it 

little or no affinity anhydrotrypsin, or 

3P« vere tne ^^/^^ ,3 the .fH(Bm> to 

antibodies to BPTI could be ^ya„trypBin, a 

test for the presence of converted to 

trypsin ->^^^trno"^teolytic activity but retains 
aehydroalanme, l-a-^ ^ andH0BE77,. 
its affinity for BPTi l 

of the IPBD on the 
preferably, the P«-"=; the use of a 

surface of the GP is ^-"f^^f^ ,^<„bd, with hi,h 
::rt; ^^.^^^^1^- derivative of ...B. 
is denoted as AfH(IPBD)*. 

v,„„ used, then the procedures 
U rando. DKA has been use , .^^^^^^ 

of sec. 15 are used to obtain ,i„„l 

the display-of-«BD phenot^e Jl 

isolates »ay be screened for t ^^^^.^^ 

^ The tests of this 5^«f 
phenotype_ ^he 

or more of tnese 

• ^ i-o the affinity molecule 
= « - rrrcre- "a:ion as disclosed in 

are obtained we taxe u 

Sec. 9- 

« one or .ore - - r ^^X^t the 

„ xPBD is -i^P^f ^ ^o«n affinity for IPBD is 

binding of molecules havin, ^f 

aue to the *^-f^,^^i.al techni^es, such as: 
Standard genetic ana d 
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<«Kri aene into the parent 
GP to verify that nsp ip - 

*^„-m +->ie isolated GP 
„ .eleti„, ^^ZZ loss o. 
t= verify that loss of fiiEJEM 

binding, 

3) showing that binding of GPs to AfM(IPBD)
correlates with [XINDUCE] (in those cases that
expression of osp-ipbd is controlled by
[XINDUCE]), and

4) showing that binding of GPs to AfM(IPBD) is
specific to the immobilized AfM(IPBD) and not to
the support matrix.

..enc. Of on the . - 

. strong — "™,:r^ ITZZ. Of X.BO ,suoh 
reactions that are l.ne- in ^^^^^^ ^„,ipbD,., W 

binding of y ^^^.^1 reactions 

We sequence the relevant ipbd gene fragment from
each of several clonal isolates to determine the
construction.

We establish the maximum salt concentration and pH
range for which the GP(IPBD) binds the
AfM(IPBD).
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. T ^ «n the outside of the GP, 
« the IPSO i= intrcuce. 
^4 if that display IS <=1^^^1^ otter«is« »e «st 

sij^r'jrr:;; .p.o..te ...... 



5 



measures , 



^ . J. a. 

If we have attempted to fuse an ipbd fragment to a
natural osp fragment, our options are:

1) pick a different fusion to the same osp by:

a) using the opposite end of the osp,

b) keeping more or fewer residues from the osp in
the fusion, for example, in increments of 3
or 4 residues,

c) trying a known or predicted domain
boundary, or

d) trying a predicted loop or turn position,

2) pick a different osp, or

3) switch to the random DNA method.

If we have just tried the random DNA method
unsuccessfully, our options are:

1) choose a different relationship between the ipbd
fragment and random DNA (ipbd first, random DNA
second or vice versa),

2) try a different degree of partial digestion, a
different enzyme for partial digestion, a
different degree of shearing, or a different source
of natural DNA, or

3) switch to the natural OSP method.

If all reasonable OSPs of the current GP have been
tried and the random DNA method has been tried, both
without success, we pick a new GP.






Part II 



Sec. 10.0: Affinity Separation Means:

In Part II we optimize an affinity separation
system that will be used in Part III to enrich a
population of GP(vgPBD)s for those GPs that
display PBDs with increased affinity for the target.

Affinity chromatography is the preferred means,
but FACS, electrophoresis, or other means may also be
used.



Sec. 10.1: Optimization of affinity separation:

Changes in eluant concentration cause GPs to elute
from the column. Elution volume, however, is more
easily measured and specified. It is to be understood
that the eluant concentration is the agent causing GP
release and that an eluant concentration can be
calculated from an elution volume and the specified
gradient.

Using a specified elution regime, we compare the
elution volumes of GP(IPBD)s with those of wtGP
on affinity columns supporting AfM(IPBD).
Comparisons are made at various: a) amounts of IPBD/GP,
b) densities of AfM(IPBD)/(volume of matrix) (DoAMoM),
c) initial ionic strengths, d) elution rates, e)
amounts of GP/(volume of support), f) pHs, and g)
temperatures, because these are the parameters most
likely to affect the sensitivity and efficiency of the
separation. We then pick those conditions giving the
best separation.

We do not optimize pH or temperature; rather we
record optimal values for the other parameters for one
or more values of pH and temperature. The conditions
of intended use, specified by the user (Sec. 11), may
include a specification of pH or temperature. If pH is
specified, then pH will not be varied in eluting the
column (Sec. 15.3). Decreasing pH may be used to
liberate bound GPs from the matrix. If the intended
use specifies a temperature, we will hold the affinity
column at the specified temperature during elution, but
we might vary the temperature during recovery.

The AfM(IPBD) is preferably one known to have
moderate affinity for the IPBD (Kd in the range 10^-6 M
to 10^-8 M). When populations of GP(vgPBD)s are
fractionated, there will be roughly three
subpopulations: a) those with no binding, b) those that
have some binding but can be washed off with high salt
or low pH, and c) those that bind very tightly and must
be rescued in situ. We optimize the parameters to
separate (a) from (b) rather than (b) from (c). Let
PBDw be a PBD having weak binding to the target and
PBDs be a PBD having strong binding. Higher DoAMoM
might, for example, favor retention of GP(PBDw) but
also make it very difficult to elute viable GP(PBDs).
We will optimize the affinity separation to retain
GP(PBDw) rather than to allow release of GP(PBDs)



PCT/US89/03731
WO 90/02809 



70 



because a tightly bound GP(PBDs) can be rescued by in
situ growth. If we find that DoAMoM strongly affects
the elution volume, then in Part III we may reduce the
amount of target on the affinity column when an SBD has
been found with moderately strong affinity (Kd on the
order of 10^-7 M) for the target.

In this step, we measure elution volumes of
genetically pure GPs that elute from the affinity
matrix as sharp bands that can be detected by
absorption. Samples from effluent fractions are plated
on suitable medium (cells or spores) or on sensitive
cells (phage) and colonies or plaques counted.

Several values of IPBD/GP, DoAMoM, elution rates,
initial ionic strengths, and loadings should be
examined. We anticipate that optimal values of IPBD/GP
and DoAMoM will be correlated and therefore should be
optimized together. The effects of initial ionic
strength, elution rate, and amount of GP/(matrix
volume) are unlikely to be strongly correlated, and so
they can be optimized independently.
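
The search strategy in the preceding paragraph can be sketched as a joint grid over the two correlated parameters followed by one-at-a-time scans of the others. This is only an illustration: `score` stands for a hypothetical, user-supplied measurement of separation quality (higher is better), and all parameter names are invented for the example.

```python
from itertools import product


def optimize_separation(score, joint_grid, independent_grids, defaults):
    """Pick separation parameters in two stages.

    score             : callable taking a dict of parameters and returning
                        a number, higher meaning better separation
                        (hypothetical stand-in for a column experiment)
    joint_grid        : dict name -> candidate values, optimized together
    independent_grids : dict name -> candidate values, optimized one by one
    defaults          : starting values for the independent parameters
    """
    best = dict(defaults)

    # Stage 1: correlated parameters (e.g. IPBD/GP and DoAMoM) are
    # scanned over every combination.
    names = list(joint_grid)
    combos = product(*(joint_grid[n] for n in names))
    best_combo = max(
        combos, key=lambda c: score({**best, **dict(zip(names, c))})
    )
    best.update(zip(names, best_combo))

    # Stage 2: weakly correlated parameters are scanned independently,
    # holding everything else at its current best value.
    for name, values in independent_grids.items():
        best[name] = max(values, key=lambda v: score({**best, name: v}))
    return best
```

In practice each call to `score` corresponds to running a column and measuring elution volumes, so the joint grid is kept small and the independent scans carry most of the remaining parameters.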

For each set of parameters to be tested, the
column is eluted in a specified manner. For example,
we may use a regime called Elution Regime 1: a KCl
gradient runs from 10 mM to maximum allowed for the
GP(IPBD) viability in 100 fractions of 0.05 Vv (void
volume), followed by 20 fractions of 0.05 Vv at maximum
allowed KCl; pH of the buffer is maintained at the
specified value with a convenient buffer such as Tris.
It is important that the conditions of this
optimization be similar to the conditions that are used
in Part III for selection for binding to target (Sec.
15.3) and recovery of GPs from the chromatographic
system (Sec. 15.4).
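
As a worked illustration of Elution Regime 1, the sketch below maps a fraction number to its KCl concentration and cumulative elution volume. It assumes a linear ramp in which fraction 1 is at 10 mM and fraction 100 reaches the maximum; the text does not specify the ramp shape, and the 1000 mM maximum is a placeholder, since the real limit depends on GP(IPBD) viability.

```python
def kcl_at_fraction(i, c_start_mM=10.0, c_max_mM=1000.0,
                    n_gradient=100, n_hold=20):
    """KCl concentration (mM) of fraction i (1-based).

    Fractions 1..n_gradient form an assumed linear ramp from c_start_mM
    to c_max_mM; the following n_hold fractions stay at c_max_mM.
    c_max_mM is a hypothetical value chosen only for illustration.
    """
    if not 1 <= i <= n_gradient + n_hold:
        raise ValueError("fraction index out of range")
    if i >= n_gradient:
        return c_max_mM
    step = (c_max_mM - c_start_mM) / (n_gradient - 1)
    return c_start_mM + step * (i - 1)


def elution_volume(i, fraction_vv=0.05):
    """Cumulative elution volume, in void volumes (Vv), after fraction i."""
    return i * fraction_vv
```

With these defaults the whole regime spans 120 fractions, that is, 6 void volumes.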

«>e ,ene is reflate. ^ CXXKDUC^l - 

^«r,+-T-oiled by varying [XiNuuur.j. 
, -BO/OP can^^ c n ^ ..entity 

Appropriate values °^ <■ example, XIHDncE 

Of CXIKBOCEI ana the Pr»°*«; ^e pr»oter ie 

" - -:/;:ri:tirr.=, m: .tu an 

le ^^^^a or a„ acceptable , level c. e^cpression 

is obtained. 

~^^rt'ur::r:.trie:rfn 

material can bind to ^* a«iciency 
appropriate steps ^ „ so 

of separation will be a smootn 
►hat it is appropriate to cover a wide range or 
Tr ol- with a coarse .rid and then 
neigbborhood of tbe approximate optimum with a 

grid. 

Several values of initial ionic strength are
tested, such as 1.0 mM, 5.0 mM, 10.0 mM and 20.0 mM.

The elution rate is varied, by successive factors
of 1/2, from the maximum attainable rate to 1/16 of
this value. The fastest elution rate giving good
separation is optimal.

The goal of the optimization is to obtain a sharp
transition between bound and unbound GPs, triggered by
increasing salt or decreasing pH or a combination of
both. This optimization need be performed only: a) for
each temperature to be used, b) for each pH to be used,
and c) when a new GP(IPBD) is created.
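
The elution-rate scan described above can be written down in a few lines. The sketch assumes a hypothetical predicate `separation_ok` that reports whether a given rate produced acceptable separation; the function names are illustrative only.

```python
def candidate_rates(max_rate, n_halvings=4):
    """Rates from max_rate down to max_rate/16 by successive factors of 1/2."""
    return [max_rate / 2 ** k for k in range(n_halvings + 1)]


def fastest_acceptable(rates, separation_ok):
    """Return the fastest rate for which separation_ok(rate) is true.

    separation_ok is a hypothetical stand-in for the experimental
    judgment of whether the separation at that rate is good.
    Returns None when no tested rate is acceptable.
    """
    for rate in sorted(rates, reverse=True):
        if separation_ok(rate):
            return rate
    return None
```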

Regulatable promoters are available for all
genetic packages except, possibly, bacterial spores. A
regulatable promoter functional in bacterial spores may
be obtained by constructing a hybrid of a sporulation
promoter and a regulatable bacterial promoter
or by saturation mutagenesis of a sporulation
promoter followed by screening for regulatable promoter
activity (cf. OLIP86, OLIP87). When the promoter of
the osp-ipbd gene is not regulatable, we optimize the
DoAMoM, the elution rate, and the amount of GP/volume
of matrix. If the optimized affinity separation is not
acceptable, we must develop a means to alter the amount
of IPBD per GP.

Sec. 10.2: Sensitivity of affinity separation:



We determine the sensitivity of the affinity
separation (Csensi) by measuring the minimum quantity
of GP(IPBD) that can be detected in the presence of a
large excess of wtGP. The user chooses a number of
separation cycles, denoted Nchrom, that will be
performed before an enrichment is abandoned;
preferably, Nchrom is in the range 6 to 10, and Nchrom
must be greater than 4. Enrichment can be terminated
by isolation of a desired GP(SBD) before Nchrom passes.

The measurement of sensitivity is significantly
expedited if GP(IPBD) and wtGP carry different
selectable markers.

Mixtures of GP(IPBD) and wtGP are prepared in the
ratios of 1:Vlim, where Vlim ranges by an appropriate
factor (e.g. 1/10) over an appropriate range, typically
10^11 through 10^4. Large values of Vlim are tested
first; if a positive result is obtained for one value
of Vlim, no smaller values of Vlim need be tested.
Each mixture is applied to a column supporting, at the
optimal DoAMoM, an AfM(IPBD) having high affinity for
IPBD, and the column is eluted by the specified elution
regime. The last fraction that contains viable GPs and
an inoculum of the column matrix material are cultured.
If GP(IPBD) and wtGP have different selectable markers,
then transfer onto selection plates identifies each
colony. Otherwise, a number (e.g. 32) of GP clonal
isolates are tested for presence of IPBD by the
techniques discussed in Sec. 8.

If IPBD is not detected on the surface of any of
the isolated GPs, then GPs are pooled from: a) the last
few (e.g. 3 to 5) fractions that contain viable GPs
and b) an inoculum taken from the column matrix. The
pooled GPs are cultured and passed over the same column
and enriched for GP(IPBD) in the manner described.
This process is repeated until Nchrom passes have been
performed, or until the IPBD has been detected on the
GPs. If GP(IPBD) is not detected after Nchrom passes,
Vlim is decreased and the process is repeated.
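
The titration just described, testing the largest Vlim first and stopping at the first positive result, can be sketched as follows. `passes_needed` is a hypothetical stand-in for the whole experimental loop: given a ratio 1:Vlim and a limit of Nchrom passes, it returns the number of passes after which GP(IPBD) was detected, or None if it was never detected. The exponent range 11 down to 4 follows the typical range given above.

```python
def vlim_series(max_exp=11, min_exp=4):
    """Vlim values 10^max_exp, ..., 10^min_exp, largest first."""
    return [10 ** e for e in range(max_exp, min_exp - 1, -1)]


def measure_sensitivity(passes_needed, n_chrom=8, max_exp=11, min_exp=4):
    """Return (Csensi, Kcyc), or None if no dilution gave a recovery.

    passes_needed(vlim, n_chrom) is a hypothetical callable standing in
    for the chromatography experiment; it returns an int or None.
    """
    if n_chrom <= 4:
        raise ValueError("Nchrom must be greater than 4")
    for vlim in vlim_series(max_exp, min_exp):
        k_cyc = passes_needed(vlim, n_chrom)
        if k_cyc is not None:
            # First success at the largest Vlim: smaller values
            # need not be tested.
            return vlim, k_cyc
    return None
```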

Csensi equals the highest value of Vlim for which
the user can recover GP(IPBD) within Nchrom passes.
The number of chromatographic cycles (Kcyc) that were
needed to isolate GP(IPBD) gives a rough estimate of
Ceff; Ceff is approximately the Kcyc-th root of Vlim:

Ceff = (approx.) exp( loge(Vlim)/Kcyc )

For example, if Vlim were 4.0 x 10^8 and three
separation cycles were needed, then

Ceff = (approx.) 736.
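
The estimate can be checked numerically. The short sketch below simply evaluates the formula as given; the function name is illustrative.

```python
import math


def c_eff(v_lim, k_cyc):
    """Per-cycle enrichment estimate: the Kcyc-th root of Vlim."""
    return math.exp(math.log(v_lim) / k_cyc)


if __name__ == "__main__":
    # Vlim = 4.0 x 10^8 recovered in three cycles
    print(f"{c_eff(4.0e8, 3):.1f}")  # prints approximately 736.8
```

Cubing the result recovers Vlim, which confirms that roughly a 736-fold enrichment per cycle accounts for a 4.0 x 10^8-fold enrichment in three cycles.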

To determine Ceff more accurately, we determine
the ratio of GP(IPBD)/wtGP loaded onto an AfM(IPBD)
column that yields approximately equal amounts of
GP(IPBD) and wtGP after elution.

Sec. 10.4: Other Separation Means:

Other separation means are optimized in a manner
parallel to that used for affinity chromatography.

. / rr FACStar from Beckton-Dickinson, 

; fluorescent molecule, denoted Mm d^BD) • 

...,a.les tHat „.t .e "Pt^i^ed — a, amount - 
concentration ^^^^'^ ' ^^L,,,,,, 

:":e^.maonlne^^B™ 

=er.--ri.r;:rrtrr^-^^^^^^ 

independently . 
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v,..«sis is most appropriate to 
Electrophoresxs ^.^^ ^^^^^..^ . 

bacteriophage j/^ J^;; ^^.^..^ion means if the 

Electrophoresis xs a preferred s p ^ 

coluim or " por example, chloro«etate 

change the entire target. essentially 
contain only s^en ^,,,„..e«.e 
altered by any . ^han GPs that do 

„uld become more -^^^^^LT^L^s of CPS could be 
not bind the ion and so these o 

separated. 

.tars to optimize for electrophoresis 
The parameters to op xaaterial, 
^ TDun/GP b) concentration wi. y 
include: a) IPBD/gf, /tpbD^ , d) ionic 

- agarose. C — clltg c^^^ty'o. the 
strength, e) size, shape, an currents, 
electrophoresis 'PP^^^''^' ^^.e^wy, IPBD/GP and 

and f) concentration of ^^^^^ 
[Afm(XPBD)] are varied « 
parameters are optimized independently. 

Part III 

Sec. 11.0: Choice of target material:

Any material may be chosen as target material,
subject only to the following restrictions.

If affinity chromatography is to be used, then:

1) the molecules of the target material must be of
sufficient size and chemical reactivity to be
applied to a solid support suitable for affinity
separation,

2) after application to a matrix, the target
material must not react with water,

3) after application to a matrix, the target
material must not bind or degrade proteins in a
non-specific way, and

4) the molecules of the target material must be
sufficiently large that attaching the material to
a matrix allows enough unaltered surface area
(generally at least 500 square Angstroms, excluding
the atom that is connected to the linker) for protein
binding.

If FACS is to be used as the affinity separation
means, then:

1) the molecules of the target material must be of
sufficient size and chemical reactivity to be
conjugated to a suitable fluorescent dye or the
target must itself be fluorescent,

2) after any necessary fluorescent labeling, the
target must not react with water,

3) after any necessary fluorescent labeling, the
target material must not bind or degrade proteins
in a non-specific way, and

4) the molecules of the target material must be
sufficiently large that attaching the material to
a suitable dye allows enough unaltered surface
area (generally at least 500 square Angstroms,
excluding the atom that is connected to the linker)
for protein

binding.

If affinity electrophoresis is to be used, then:

1) the target must either be charged or of such a
nature that its binding to a protein will change
the charge of the protein,

2) the target must not react with water,

3) the target must not bind or degrade
proteins in a non-specific way, and

4) the target must be compatible with a suitable
gel material.

.o.=i.le target ^ J^^j:: ^3"^," 

U-ltea to: a, soluble ^^^^/Jt^^.^^ivatea (Moc. 

.,c,Xo.ln, hu.an alpha 

cyclase toxin, any retrovir .^^ (^^ch as 

retroviral 333 protease), lycoproteins (such 

n„.an low density ^.f ^7*""\;,cUvsaccharides (such 
as a monoclonal antxbody), d,l^P^ 

0-anti,en =£ "^^^^^f-^ff^^^messenger 

,cias (such as ^^-/J^'ltne J" =P-«^<="^'' « 
asDNA or BSDHA, possibly with s^ cholesterol , 

soluble organic ^"^^'^''^^ morphine, codeine, 
aspartame, bllirUD , benzo(a)pyrene, 
aichlorodiphenyltrichlorethane (DDT), ^^,,„„„cin 

prostaglandin .^^^^^^^l^^^'^s iron haem or 

cobalt haem), h) organic polymers (such as cellulose or
chitin), i) insoluble minerals (such as zeolites or
hydroxylapatite), j) viral proteins (such as influenza
haemagglutinin or phage lambda capsid), and k) bacterial
membrane or outer membrane proteins (such as LamB from
E. coli or flagella proteins).

A supply of several milligrams of pure target
material is desired. Impure target material could be
used, but one might obtain a protein that binds to a
contaminant instead of to the target.

The following information about the target
material is highly desirable:

1) stability as a function of temperature, pH, and
ionic strength,

2) stability with respect to chaotropes such as
urea or guanidinium Cl,

3) pI,

4) molecular weight,

5) requirements for prosthetic groups or ions,
such as haem or Ca+2, and

6) proteolytic activity, if any.

In addition to this most desirable information, it
is useful to know: 1) the target's sequence, if the
target is a macromolecule, 2) the 3D structure of the
target, 3) enzymatic activity, if any, and 4) toxicity,
if any.

The user of the present invention specifies
certain parameters of the intended use of the binding
protein:

1) the acceptable temperature range,

2) the acceptable pH range,

3) the acceptable concentrations of ions and
neutral solutes, and

4) the maximum acceptable dissociation constant
for the target and the SBD:

Kd = [Target][SBD]/[Target:SBD]

In some cases, the user may require discrimination
between T, the target, and N, some non-target. Let

KT = [T][SBD]/[T:SBD], and
KN = [N][SBD]/[N:SBD],

then KT/KN = ([T][N:SBD])/([N][T:SBD]). The
user then specifies a maximum acceptable value for
the ratio KT/KN.
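
The two specifications can be computed directly from equilibrium concentrations. In the sketch below the function and argument names are illustrative; concentrations are assumed to be in mol/L and measured in the same solution, so that the free SBD concentration is common to both equilibria and cancels in the ratio.

```python
def dissociation_constant(free_ligand, free_sbd, complex_conc):
    """Kd = [ligand][SBD] / [ligand:SBD], all in mol/L."""
    if complex_conc <= 0:
        raise ValueError("complex concentration must be positive")
    return free_ligand * free_sbd / complex_conc


def discrimination_ratio(t, t_sbd, n, n_sbd):
    """KT/KN = ([T][N:SBD]) / ([N][T:SBD]).

    t, n         : free target and free non-target concentrations
    t_sbd, n_sbd : concentrations of the T:SBD and N:SBD complexes
    """
    return (t * n_sbd) / (n * t_sbd)
```

A small ratio means the SBD binds the target much more tightly than the non-target, which is why the user's specification is an upper bound on KT/KN.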

If the target material is a general protease, one
must consider the following points:

1) a highly specific protease can be treated like
any other target,

2) a general protease may degrade the OSPs of the
GP, including OSP-PBDs; there are several
alternative ways of dealing with general
proteases, including: a) a chemical inhibitor may
be used to prevent proteolysis (e.g.
phenylmethylsulfonyl fluoride (PMSF), which
inhibits serine proteases), b) one or more
active-site residues may be mutated to create an
inactive protein (e.g. a serine protease in which
the active serine is mutated to alanine), or c)
one or more active-site amino acids of the
protein may be chemically modified to destroy the
catalytic activity (e.g. a serine protease in
which the active serine is converted to
anhydroserine),

3) SBDs selected for binding to a protease need 
not be inhibitors; SBDs that happen to inhibit 
the protease target are a fairly small subset of 
SBDs that bind to the protease target,

4) the more we modify the target protease, the
less likely we are to obtain an SBD that inhibits
the target protease, and

5) if the user requires that the SBD inhibit the 
target protease, then the active site of the 
target protease must not be modified any more than 
necessary; inactivation by mutation or chemical 
modification are preferred methods of inactivation 
and a protein protease inhibitor becomes a prime 
candidate for IPBD. For example, BPTI could be 
mutated, by the methods of the present invention, 
to bind to proteases other than trypsin (TANK77 
and TSCH87) . 
Choice of GP(IPBD):

The user must pick a GP(IPBD) that is suitable to 
the chosen target according to the criteria of Sec. 2. 
It is anticipated that a small collection of
GP(IPBD)s can be assembled such that, for any chosen
target, at least one member of the collection will be a
suitable starting point for engineering a protein that
binds to the chosen target by the methods of the
present invention. The user should optimize the
affinity separation for conditions appropriate to the
intended use by the methods described in Part II.

n. T^...tifi --^^^^ Family of PBD^ , Related 
15 -hn PPBD, Be g enerated 

Choosing residues on IPBD (or other PPBD) to vary:

We choose residues in the IPBD to vary through
consideration of several factors, including: a) the 3D
structure of the IPBD, b) sequences homologous to IPBD,
and c) modeling of the IPBD and mutants of the IPBD.
Because the number of residues that could strongly
influence binding is always greater than the number
that can be varied simultaneously, the user must pick a
subset of those residues to vary at one time. The user
must also pick trial levels of variegation and
calculate the abundances of various sequences. The
list of varied residues and the level of variegation at
each varied residue are adjusted until the composite
variegation is commensurate with C_sensi and M_ntv.

A key concept is that only structured proteins
exhibit specific binding, i.e. can bind to a particular
chemical entity to the exclusion of most others. Thus
the residues to be varied are chosen with an eye to
preserving the underlying IPBD structure.
Substitutions that prevent the PBD from folding will
cause GPs carrying those genes to bind indiscriminately
so that they can easily be removed from the population.

Burial of hydrophobic surfaces so that bulk water
is excluded is one of the strongest forces driving the
binding of proteins to other molecules. Bulk water can
be excluded from the region between two molecules only
if the surfaces are complementary. We must test as
many surfaces as possible to find one that is
complementary to the target. The selection-through-binding
isolates those proteins that are more nearly
complementary to some surface on the target. The
effective diversity of a variegated population is
measured by the number of different surfaces, rather
than the number of protein sequences. Thus we should
maximize the number of surfaces generated in our
population, rather than the number of protein
sequences.

In hypothetical example 1, we consider a
hypothetical PBD, shown in Figure 3, binding to a
hypothetical target. Figure 3 is a 2D schematic of 3D
objects; by hypothesis, residues 1, 2, 4, 6, 7, 13, 14,
15, 20, 21, 22, 27, 29, 31, 33, 34, 36, 37, 38, and 39
of the IPBD are on the 3D surface of the IPBD, even
though shown well inside the circle. Proteins do not
have distinct, countable faces. Therefore we define an
"interaction set" to be a set of residues such that all
members of the set can simultaneously touch one
molecule of the target material without any atom of the
target coming closer than van der Waals distance to any



WO 90/02809                                  PCT/US89/03731
main-chain atom of the IPBD. The concept of a residue
"touching" a molecule of the target is discussed
below. One hypothetical interaction set, Set A, in
Figure 3 comprises a group of residues that includes
residues 6 and 7, represented by squares. Another
hypothetical interaction set, Set B, comprises
residues 1, 2, 4, 6, 31, 37, and 39, represented by
circles.

If we vary one residue, number 21 for example,
through all twenty amino acids, we obtain 20 protein
sequences and 20 different surfaces for interaction set
A. Note that residue 6 is in two interaction sets and
variation of residue 6 through all 20 amino acids
yields 20 versions of interaction set A and 20 versions
of interaction set B.

Now consider varying two residues, each through
all twenty amino acids, generating 400 protein
sequences. If the two residues varied were, for
example, number 1 and number 21, there would be
only 40 different surfaces because interaction set A
does not depend on residue 1 and interaction set B does
not depend on residue 21. If the two residues varied,
however, were number 7 and number 21, 400 surfaces
would be generated.

If N spatially separated residues are varied at
one time, 20 x N surfaces are generated. Variation of
N residues in the same interaction set yields 20^N
surfaces. For example, if N = 7, variation of
separated residues yields 140 surfaces, while variation
of interacting residues yields 20^7 = 1.28 x 10^9
surfaces. Thus, to maximize the number of surfaces
generated when N residues are varied, all residues
should be in the same interaction set.
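
The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes, as the text does, that each varied residue takes all 20 amino acids and that separated residues each belong to a different interaction set.

```python
def surfaces_separated(n_residues: int, n_types: int = 20) -> int:
    """Surfaces obtained when each varied residue lies in a different
    interaction set: each residue contributes n_types surfaces to its
    own set, so the contributions add."""
    return n_types * n_residues


def surfaces_same_set(n_residues: int, n_types: int = 20) -> int:
    """Surfaces obtained when all varied residues lie in one
    interaction set: every combination is a distinct surface."""
    return n_types ** n_residues


for n in (1, 2, 7):
    print(n, surfaces_separated(n), surfaces_same_set(n))
# For n = 2 this prints 40 and 400; for n = 7, 140 and 1280000000.
```

The multiplicative growth in the second function is the reason for concentrating the varied residues in one interaction set.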
The amount of surface area buried in strong
protein-protein interactions ranges from 1000 Å^2 to
2000 Å^2 (SCHU79, p103ff). Individual amino acids have
total surface areas that depend mostly on type of amino
acid and weakly on conformation. These areas range
from about 180 Å^2 for glycine to about 360 Å^2 for
tryptophan. From amino-acid solvent exposures of
published protein structures, we calculate that 1000 Å^2
on a protein surface comprises between 4 and 30
amino-acid residues. Varied amino acid sequences, as found
in actual proteins, involve between 10 and 25 residues
in forming 1000 Å^2 of protein surface. Schulz and
Schirmer estimate that 100 Å^2 of protein surface can
exhibit as many as 1000 different specific patterns
(SCHU79, p105). The number of surface patterns rises
exponentially with the area that can be varied
independently. One of the BPTI structures recorded in
the Brookhaven Protein Data Bank (6PTI), for example,
has a total exposed surface area of 3997 Å^2 (using the
method of Lee and Richards (LEEB71) and a solvent
radius of 1.4 Å and atomic radii as shown in Table 7).
If we could vary this surface freely and if 100 Å^2 can
produce 1000 patterns, we could construct about 10^120
different patterns by varying the surface of BPTI.
This calculation is intended only to suggest the huge
number of possible surface patterns based on a common
protein backbone.
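
The order of magnitude of this estimate can be reproduced as follows. The sketch assumes, purely for illustration, that the surface divides into independent 100 Å^2 patches, each able to show 1000 patterns; the function name is hypothetical.

```python
import math


def log10_patterns(area, patch_area=100.0, patterns_per_patch=1000.0):
    """log10 of the number of surface patterns, assuming the surface is
    made of independent patches of patch_area (in square Angstroms),
    each capable of patterns_per_patch distinct patterns."""
    n_patches = area / patch_area
    return n_patches * math.log10(patterns_per_patch)


# Exposed surface area of BPTI quoted in the text: 3997 square Angstroms.
print(round(log10_patterns(3997.0), 1))  # exponent of roughly 120
```

As the text notes, the patches are not truly independent, so this is an upper-bound style suggestion rather than a count.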

One protein framework cannot, however, display all
possible patterns over any one particular 100 Å^2 of
surface merely by replacement of the side groups of
surface residues. The protein backbone holds the
varied side groups in approximately constant locations
so that the variations are not independent. We can,
nevertheless, generate a vast collection of different
protein surfaces by varying those protein residues that
face the outside of the protein.

Examination of a model of BPTI in contact with
myoglobin shows that residues 3, 7, 8, 10, 13, 39, 41,
and 42 can all simultaneously contact a molecule the
size and shape of myoglobin. Residue 49 cannot touch a
single myoglobin molecule simultaneously with any of
the residues in the first set even though all are on
the surface of BPTI. It is not the intent of the
present invention, however, to use models to determine
which part of the target molecule will actually be the
site of binding by a PBD.
For cassette mutagenesis, the protein residues to
be varied are, preferably, close enough in sequence
that the variegated DNA (vgDNA) encoding all of them
can be made in one piece. The present invention is not
limited to a particular length of vgDNA that can be
synthesized. With current technology, a stretch of 60
amino acids (180 DNA bases) can be spanned.



One can use other mutational means, such as
single-stranded-oligonucleotide-directed mutagenesis
(BOTS85), using two or more mutating primers to mutate
widely separated residues.

Alternatively, to vary residues separated by more
than sixty residues, two cassettes may be mutated. A
first cassette is mutagenized to produce a population
having, for example, up to 30,000 members. Using the
variegated OCV, we mutagenize a second cassette to
produce a second variegated population having the
desired diversity.
The composite level of variation must not exceed
the prevailing capabilities to a) produce very large
numbers of independently transformed cells or b) detect
small components in a highly varied population. The
limits on the level of variegation are discussed in
Sec. 13.2.

We assemble the data about the IPBD and the target
that are useful in deciding which residues to vary: 1)
3D structure, or at least a list of residues on the
surface of the IPBD, 2) list of sequences homologous to
IPBD, and 3) model of the target molecule or a stand-in
for the target.

These data and an understanding of the behavior of 
different amino acids in proteins will be used to 

answer two questions: 

1) Which residues of the IPBD are on the outside
and close enough together in space to touch the 
target simultaneously? 

2) Which residues of the IPBD can be varied with 
high probability of retaining the underlying IPBD 
structure? 

Although an atomic model of the target material
(from X-ray crystallography, NMR, etc.) is preferred in
such examination, it is not necessary. For example, if
the target were a protein of unknown 3D structure, it
would be sufficient to know the molecular weight of the
protein and whether it were a soluble globular protein,
a fibrous protein, or a membrane protein. One can then
choose a protein of known structure of the same class
and similar size and shape to use as a molecular
stand-in and yardstick. At low resolution, all proteins
of a given size and class look much the same. The
specific volumes are the same, all are more or less
spherical, and therefore all proteins of the same size
and class have about the same radius of curvature. The
radii of curvature of the two molecules determine how
much of the two molecules can come into contact.

The most appropriate method of picking the
residues of the protein that should be varied is by
viewing, with interactive computer graphics, a model of
the IPBD. A stick-figure representation of molecules
is preferred. A suitable set of hardware is an Evans &
Sutherland PS390 graphics terminal (Evans & Sutherland
Corporation, Salt Lake City, UT) and a MicroVAX
computer (Digital Equipment Corp.). Computer programs
for viewing and manipulating protein models include: a)
PS-FRODO, written by T. A. Jones (JONE85) and
distributed by the Biochemistry Department of Rice
University, and b) PROTEUS, developed by Dayringer,
Tramontano, and Fletterick (DAYR86).
Dayringer, Traitiantano, and 

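Theoretical calculations, such as dynamic
simulations of proteins, can be used to estimate
whether a substitution at a particular residue of a
particular amino-acid type might produce a protein of
approximately the same 3D structure as the parent
protein. Such calculations might also indicate
whether a particular substitution will greatly affect
the flexibility of the protein.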
The principal set:
Using the knowledge of which residues are on the
surface of the IPBD, we pick residues that are close
enough together on the surface of the IPBD to touch a
molecule of the target simultaneously without having
any IPBD main-chain atom come closer than van der Waals
distance (viz. 4.0 to 5.0 Å) to any target atom. For
the purposes of the present invention, a residue of the
IPBD "touches" the target if: a) a main-chain atom is
within van der Waals distance, viz. 4.0 to 5.0 Å, of
any atom of the target molecule, or b) the C_beta is
within D_cutoff of any atom of the target molecule so
that a side-group atom could make contact with that
atom. Because side groups differ in size (cf.
Table 35), some judgment is required in picking
D_cutoff. In the preferred embodiment, we will use
D_cutoff = 8.0 Å, but other values in the range 6.0 Å
to 10.0 Å could be used. If the IPBD has G at a
residue, we construct a pseudo C_beta with the correct
bond distance and angles and judge the ability of the
residue to touch the target from this pseudo C_beta.
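
The "touching" criterion lends itself to a short distance test. The sketch below is an assumption-laden illustration, not a description of any program used in the invention: the function names, the toy coordinates, and the choice of 5.0 Å as the van der Waals contact distance are all hypothetical, and the construction of a pseudo C_beta for glycine is left to the caller.

```python
import math


def residue_touches(main_chain, c_beta, target_atoms,
                    vdw=5.0, d_cutoff=8.0):
    """Return True if the residue 'touches' the target: either a
    main-chain atom lies within vdw of a target atom, or the C_beta
    (or a pseudo C_beta supplied for glycine) lies within d_cutoff
    of a target atom. Coordinates are (x, y, z) tuples in Angstroms."""
    for atom in target_atoms:
        if any(math.dist(m, atom) <= vdw for m in main_chain):
            return True
        if math.dist(c_beta, atom) <= d_cutoff:
            return True
    return False


def candidate_residues(residues, target_atoms, **kwargs):
    """Names of residues that touch the target; 'residues' maps a
    residue name to a (main_chain_atoms, c_beta) pair."""
    return [name for name, (mc, cb) in residues.items()
            if residue_touches(mc, cb, target_atoms, **kwargs)]


# Toy data: one target atom at the origin, three residues on the x axis.
target = [(0.0, 0.0, 0.0)]
residues = {
    "near_cbeta": ([(9.0, 0.0, 0.0)], (7.0, 0.0, 0.0)),
    "far": ([(20.0, 0.0, 0.0)], (15.0, 0.0, 0.0)),
    "near_main": ([(4.5, 0.0, 0.0)], (9.0, 0.0, 0.0)),
}
print(candidate_residues(residues, target))
```

A real application would also have to confirm that all chosen residues can touch the same target molecule at once, which this per-residue test does not check.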

Alternatively, we choose a set of residues on the
surface of the IPBD such that the curvature of the
surface defined by the residues in the set is not so
great that it would prevent contact between all
residues in the set and a molecule of the target. This
method is appropriate if the target is a macromolecule,
such as a protein, because the PBDs derived from the
IPBD will contact only a part of the macromolecular
surface.

We prefer that there be some indication that the
underlying IPBD structure will tolerate substitutions
at each residue in the principal set. Such indications
could come from: a) homologous sequences, b) static
computer modeling, or c) dynamic computer simulations.

The residues in the principal set need not be
contiguous in the protein sequence. We require only
that the amino acids in the residues to be varied all
be capable of touching a molecule of the target
material simultaneously without atoms overlapping.
If the target were, for example, myoglobin, the
residues in one interaction set of BPTI could be
picked.

Preferably, the principal set contains eight to
sixteen residues. This number of residues is
sufficient to make it likely that a surface that is
complementary to the target can be found, but is small
enough that a significant fraction of the residues can
be varied at one time.
The secondary set:
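The secondary set comprises those residues not in
the primary set that touch residues in the primary set.
These residues might be excluded from the primary set
because: a) the residue is internal, b) the residue is
highly conserved, or c) the residue is on the surface,
but the curvature of the IPBD prevents the residue from
being in contact with the target at the same time as
other residues in the primary set.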

Internal residues are frequently conserved,
although conservative changes such as I to L or F to Y
may be tolerated. Such changes can subtly affect the
positioning and dynamics of adjacent residues, and such
variation may be useful once an SBD is found.
Surface residues in the secondary set are most
often located on the periphery of the principal set.
Such residues do not make direct contact with the
target simultaneously with all other residues of the
principal set. The charge on the amino acid in one of
these residues could, however, have a strong effect on
binding. It is appropriate to vary the charge of some
or all of these residues to improve an SBD. For
example, the variegated codon containing equimolar A
and G at base 1, equimolar C and A at base 2, and A at
base 3 yields amino acids T, A, K, and E with equal
probability.
The allowed level of variegation that assures
progressivity determines how many residues can be
varied at once; geometry determines which ones.

The user picks residues to vary in many ways; the
following is a preferred manner. Pairs of residues are
picked that are diametrically opposed across the face
of the principal set. Two such pairs are used to
delimit the surface, up/down and right/left.
Alternatively, three residues that form an inscribed
triangle, having as large an area as possible on the
surface, are picked. One to three other residues are
picked in a checkerboard fashion across the interaction
surface. Choice of widely spaced residues to vary
creates the possibility for high specificity because
all the intervening residues must have acceptable
complementarity before favorable interactions can occur
at widely-separated residues.
The number of residues picked is coupled to the
range through which each can be varied by the
restrictions discussed in Sec. 13.2. In the first
round, we do not assume any binding between IPBD and
the target and so progressivity is not an issue. At
the first round, the user may elect to produce a level
of variegation such that each molecule of vgDNA is
potentially different through, for example, unlimited
variegation of 10 codons (20^10 = approx. 10^13). One
run of the DNA synthesizer produces approximately 10^13
molecules of length 100 nts. Inefficiencies in
ligation and transformation will reduce the number of
proteins actually tested to between 10^7 and 5 x 10^8.
Multiple iterations of the process with such very high
levels of variegation will not yield repeatable
results; the user must decide whether this is
important.

Variation at Each Site of Mutation:

The total level of variegation is the product of
the number of variants at each varied residue. Each
varied residue can have a different scheme of
variegation, producing 2 to 20 different possibilities.
We require that the process be progressive, i.e. each
variegation cycle produces a better starting point for
the next variegation cycle than the previous cycle
produced.



     N.B.: Setting the level of variegation such
     that the PPBD and many sequences related to
     the PPBD sequence are present in detectable
     amounts ensures that the process is
     progressive. If the level of variegation is
     so high that the PPBD sequence is present at
     such low levels that there is an appreciable
     chance that no transformant will display the
     PPBD, then the best SBD of the next round
     could be worse than the PPBD. At excessively
     high levels of variegation, each round of
     mutagenesis is independent of previous rounds
     and there is no assurance of progressivity.
     This approach can lead to valuable binding
     proteins, but repetition of experiments with
     this level of variegation will not yield
     progressive results. Excessive variation is
     not preferred.

If the level of variegation is such that the 
parental sequence and each single amino-acid change is 
present for selection, then we know that a selected
sequence is closer to optimal or the same as the 
parent. If, on the other hand, very high levels of 
variegation are used, a sequence may be selected, not 
because it is superior to the parental sequence, but 
because the parental and improved sequences are, by 
chance, absent. 

Progressivity is not an all-or-nothing property;
so long as most of the information obtained from 
previous variegation cycles is retained and many 
different surfaces that are related to the PPBD surface 
are produced, the process is progressive. If the level 
of variegation is so high that the ppbd gene may not be 
detected, the assurance of progressivity diminishes. 
If the probability of recovering PPBD is negligible, 
then the probability of progressive behavior is also 
negligible. 
An opposing force in our design considerations is
that PBDs are useful in the population only up to the
amount that can be detected; any excess above the
detectable amount is wasted. Thus we produce as many
surfaces related to PPBD as possible within the
constraint that the PPBD be detectable.

We defer specification of exactly how much
variegation is allowed until we have: a) specified real
nt distributions for a variegated codon, and b)
examined the effects of discrepancies between specified
nt distributions and actual nt distributions.

Design of vgDNA Encoding PBD Family:

We must now decide how to distribute the
variegation within the codons for the residues to be
varied. These decisions are influenced by the nature
of the genetic code. When vgDNA is synthesized,
variation at the first base of a codon creates a
population containing amino acids from the same column
of the genetic code table (as shown in Table 3-6 on
p87 of WATS87); variation at the second base of the
codon creates a population containing amino acids from
the same row of the genetic code table; variation at
the third base of the codon creates a population
containing amino acids from the same box. If two or
three bases in the same codon are varied, the pattern
is more complicated. Work with 3D protein structural
models may suggest definite sets of amino acids to
substitute at a given residue, but the method of
variation may require that either more or fewer kinds
of amino acids be included. For example, examination
of a model might suggest substitution of N or Q at a
given residue. Combinatorial variation of codons
requires that mixing N and Q at one location also
include K and H as possibilities at the same residue.
One must choose to put: 1) N only, 2) Q only, or 3) a
mixture of N, K, H, and Q. The present invention does
not rely on any predictions of which amino acids should
be placed at each residue; rather, attention is focused
on which residues should be varied.
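
The way a mixed codon drags in unwanted amino acids can be seen by enumerating the codons it contains. The following sketch uses the standard genetic code; the helper name amino_acids and the way mixtures are written as strings of bases are conventions chosen here for illustration only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def amino_acids(base1, base2, base3):
    """Set of amino acids encoded when each codon position is drawn
    independently from the given string of allowed bases."""
    return {CODE[a + b + c] for a, b, c in product(base1, base2, base3)}


# N is AAT (or AAC) and Q is CAG (or CAA): mixing A/C at base 1 and
# T/G at base 3 necessarily also produces AAG (K) and CAT (H).
print(sorted(amino_acids("AC", "A", "TG")))   # ['H', 'K', 'N', 'Q']

# A/G at base 1, C/A at base 2, A at base 3 gives four codons.
print(sorted(amino_acids("AG", "CA", "A")))   # ['A', 'E', 'K', 'T']
```

Because each position is sampled independently, the set of encoded amino acids is always a full "rectangle" of the code table, which is why N and Q cannot be mixed without K and H.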

There are many ways to generate diversity in a
protein. At one extreme, a few residues of the protein
are varied as much as possible (inter alia see CARU85,
CARU87, RICH86, and WHAR86). We will call this limit
"Focused Mutagenesis". Focused Mutagenesis is
appropriate when the IPBD or other PPBD shows little or
no binding to the target, as at the beginning of the
search for a protein to bind to a new target material.
When there is no binding between the PPBD and the
target, we preferably pick a set of residues and vary
each through all 20 possibilities.

An alternative plan of mutagenesis ("Diffuse
Mutagenesis") is to vary many more residues through a
more limited set of choices (See Vershon et al., Ch15
of INOU86 and PAKU86). This can be accomplished by
spiking each of the pure nts activated for DNA
synthesis (e.g. nt-phosphoramidites) with a small
amount of one or more of the other activated nts.
Contrary to general practice, the present invention
sets the level of spiking so that only a small
percentage (1% to .00001%, for example) of the final
product contains the initial DNA sequence. Many
single, double, triple, and higher mutations occur, but
recovery of the basic sequence is a possible outcome.
Let N_b be the number of bases to be varied, and let Q
be the fraction of all sequences that should have the
parental sequence; then M, the fraction of the mixture
that is the majority component, is

     M = exp{ log_e(Q)/N_b } = Q^(1/N_b)
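
A worked instance of this formula may help. The sketch below uses illustrative values (Q = 0.01, N_b = 30) and additionally assumes that each varied base is incorporated independently, so that the number of non-parental bases follows a binomial distribution; that independence model and the function names are assumptions made here, not statements from the specification.

```python
import math


def majority_fraction(q, n_bases):
    """Fraction M of the parental nt in each mixed reagent such that a
    fraction q of all molecules keeps the parental base at every one of
    the n_bases varied positions: M = exp(ln(q) / n_bases)."""
    return math.exp(math.log(q) / n_bases)


def fraction_with_n_changes(n, n_bases, m):
    """Fraction of molecules with exactly n non-parental bases, assuming
    each position independently keeps the parental nt with probability m."""
    return math.comb(n_bases, n) * (1.0 - m) ** n * m ** (n_bases - n)


m = majority_fraction(0.01, 30)
f = [fraction_with_n_changes(n, 30, m) for n in range(31)]
most = max(range(31), key=lambda n: f[n])  # most probable number of changes

print(f"M = {m:.4f}")
print(f"f0 = {f[0]:.4f}, most probable number of changes = {most}")
```

With these illustrative inputs, roughly 86% of each reagent is the parental nt, and molecules carrying a handful of changes dominate the population while the parental sequence remains recoverable.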

If, for example, thirty base pairs on the DNA
chain were to be varied and 1% of the product were to
have the parental sequence, then each mixed nt
substrate should contain the fraction M of the parental
nt, with the balance made up of the other nts. Table 8
shows the fraction (fn) of DNA molecules having n
non-parental bases when 30 bases are synthesized with
reagents that contain fraction M of the majority
component. The fn for the largest numbers of changes
are vanishingly small. One entry in the table gives
the number of changes that has the highest probability.
Note that substantial probability for multiple
substitutions only occurs if the fraction of parental
sequence (f0) is allowed to drop to a very low level.

Mutagenesis of this sort can be applied to any part of
the protein at any time, but is most appropriate when
some binding to the target has been established. The
N_b base pairs of the DNA chain that are synthesized
with mixed reagents need not be contiguous. They are
picked so that between N_b/3 and N_b codons are
affected to various degrees. The residues picked for
mutation are picked with reference to the 3D structure
of the IPBD, if known. For example, one might pick all
or most of the residues in the principal and secondary
set. We may impose restrictions on the extent of
variation at each of these residues based on homologous
sequences or other data. The mixture of non-parental
nts need not be random; rather, mixtures can be biased
to give particular amino acid types specific
probabilities of appearance at each codon. For
example, one residue may contain a hydrophobic amino
acid in all known homologous sequences; in such a case,
the first and third base of that codon would be varied,
but the second would be set to T. This diffuse
structure-directed mutagenesis will reveal the subtle
changes possible in protein backbone associated with
conservative interior changes, as well as some not so
subtle changes that require concomitant changes at two
or more residues of the protein.

For Focused Mutagenesis, we now consider the
distribution of nts that will be inserted at each
variegated codon. Each codon could be programmed
differently. If we have no information indicating that
a particular amino acid or class of amino acid is
appropriate, we strive to substitute all amino acids
with equal probability because representation of one
PBD above the detectable level is wasteful. Equal
amounts of all four nts at each position in a codon
yield the amino acid distribution in which each amino
acid is present in proportion to the number of codons
that code for it. This distribution has the
disadvantage of giving two basic residues for every
acidic residue. In addition, six times as much R, S,
and L as W or M occur. If five codons are synthesized
with this distribution, sequences encoding five Rs are
7776 times more abundant than sequences encoding five
Ws. To have W-W-W-W-W present at detectable levels, we
must have R-R-R-R-R present in 7776-fold excess.
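
These ratios follow directly from codon counts in the standard genetic code, as the short check below illustrates. It is a sketch only; the variable names are chosen here, and "basic" is taken to mean K and R (histidine is not counted), which is the reading under which the two-to-one ratio holds.

```python
from collections import Counter

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}

# With equimolar nts at all three bases, every codon is equally likely,
# so abundance is proportional to the number of codons per amino acid.
codons_per_aa = Counter(CODE.values())

basic = codons_per_aa["K"] + codons_per_aa["R"]    # 2 + 6 codons
acidic = codons_per_aa["D"] + codons_per_aa["E"]   # 2 + 2 codons
excess_five = (codons_per_aa["R"] // codons_per_aa["W"]) ** 5

print(basic, acidic)   # 8 4
print(excess_five)     # 7776
```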

Let Abun(x) be the abundance of DNA sequences
coding for amino acid x, defined by the distribution of
nts at each base of the codon. For any distribution,
there will be a most-favored amino acid (mfaa) with
abundance Abun(mfaa) and a least-favored amino acid
(lfaa) with abundance Abun(lfaa). We seek the nt
distribution that allows all twenty amino acids and
that yields the largest ratio Abun(lfaa)/Abun(mfaa),
subject to two constraints: equal abundances of acidic
and basic amino acids and the least possible number of
stop codons. Thus the function maximized is:

     { (1-Abun(stop)) (Abun(lfaa)/Abun(mfaa)) }.

We have simplified the search for an optimal nt
distribution by limiting the third base to T or G (C or
G is equivalent). All amino acids are possible and the
number of accessible stop codons is reduced because TGA
and TAA codons are eliminated. The amino acids F, Y,
C, H, N, I, and D require T at the third base while W,
M, Q, K, and E require G. Thus we use an equimolar
mixture of T and G at the third base.

A computer program, written as part of the present
invention and named "Find Optimum vgCodon", varies the
composition at bases 1 and 2, in steps of 0.05, and
reports the composition that gives the largest value of
the quantity { (Abun(lfaa)/Abun(mfaa)) (1-Abun(stop)) }.

A vgCodon is defined by the nt distribution at each base:

          T     C     A     G
base #1 = t1    c1    a1    g1
base #2 = t2    c2    a2    g2
base #3 = t3    c3    a3    g3

t1 + c1 + a1 + g1 = 1.0
t2 + c2 + a2 + g2 = 1.0
t3 = g3 = 0.5, c3 = a3 = 0.
The variation of the quantities t1, c1, a1, g1, t2, c2,
a2 and g2 is subject to the constraint that
Abun(E)+Abun(D) equals Abun(K)+Abun(R);

Abun(E)+Abun(D) = g1*a2
Abun(K)+Abun(R) = a1*a2/2 + c1*g2 + a1*g2/2

g1*a2 = a1*a2/2 + c1*g2 + a1*g2/2
Solving for g2, we obtain

g2 = (g1*a2 - 0.5*a1*a2)/(c1 + 0.5*a1).

In addition,

t1 = 1 - a1 - c1 - g1
t2 = 1 - a2 - c2 - g2

We vary a1, c1, g1, a2 and c2 and then calculate t1,
g2, and t2. Initially, variation is in steps of 5%.
Once an approximately optimum distribution of nts is
determined, the region is further explored with steps
of 1%. The logic of this program is shown in Table 9.
The optimum distribution is:



               Optimum vgCodon

          T     C     A     G
base #1 = 0.26  0.18  0.26  0.30
base #2 = 0.22  0.16  0.40  0.22
base #3 = 0.5   0.0   0.0   0.5

and yields DNA molecules encoding each type of amino acid
with the abundances shown in Table 10.
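
The abundances implied by a given nt distribution can be computed by summing, for each amino acid, the probabilities of all codons that encode it. The sketch below is not the "Find Optimum vgCodon" program itself; it is a minimal illustration, with hypothetical names, that evaluates the distribution quoted above against the standard genetic code.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def abundances(base1, base2, base3):
    """Abundance of each amino acid (and '*' for stop) when the three
    codon positions are drawn independently from the given nt fractions."""
    abun = defaultdict(float)
    for a, b, c in product(BASES, repeat=3):
        abun[CODE[a + b + c]] += base1[a] * base2[b] * base3[c]
    return dict(abun)


OPTIMUM = (
    dict(T=0.26, C=0.18, A=0.26, G=0.30),   # base 1
    dict(T=0.22, C=0.16, A=0.40, G=0.22),   # base 2
    dict(T=0.5, C=0.0, A=0.0, G=0.5),       # base 3
)

abun = abundances(*OPTIMUM)
stop = abun.pop("*")                        # leave only the 20 amino acids
acidic = abun["D"] + abun["E"]
basic = abun["K"] + abun["R"]
mfaa = max(abun, key=abun.get)
lfaa = min(abun, key=abun.get)

print(f"stop = {stop:.4f}, acidic = {acidic:.4f}, basic = {basic:.4f}")
print(f"mfaa = {mfaa} ({abun[mfaa]:.4f}), lfaa abundance = {abun[lfaa]:.4f}")
```

Evaluated this way, the acidic and basic totals agree to within rounding of the two-decimal nt fractions, which is the constraint imposed in the derivation of g2.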

The computer that controls a DNA synthesizer, such
as the Milligen 7500, can be programmed to synthesize
any base of an oligo-nt with any distribution of nts by
taking some nt substrates (e.g. nt phosphoramidites)
from each of two or more reservoirs. Alternatively, nt
substrates can be mixed in any ratios and placed in one
of the extra reservoirs for so-called "dirty bottle"
synthesis.

The actual nt distribution obtained will differ
from the specified nt distribution due to several
causes, including: a) differential inherent reactivity
of nt substrates, and b) differential deterioration of
reagents. It is possible to compensate partially for
these effects, but some residual error will occur. We
denote the average discrepancy between specified and
observed nt fraction as S_err,

S_err = square root( average[ ((f_obs - f_spec)/f_spec)^2 ] )

where f_obs is the amount of one type of nt found at a
base and f_spec is the amount of that type of nt that
was specified at the same base. The average is over
all specified types of nts and over a number (e.g. 10
or 20) of different variegated bases. By hypothesis,
the actual nt distribution at a variegated base will be
within 5% of the specified distribution. Actual DNA
synthesizers and DNA synthetic chemistry may have
different error levels. It is the user's
responsibility to determine S_err for the DNA
synthesizer and chemistry employed.
To determine the possible effects of errors in nt
composition on the amino-acid distribution, we modified
the program "Find Optimum vgCodon" in four ways:

1) the fraction of each nt in the first two bases
is allowed to vary from its optimum value times (1
- S_err) to the optimum value times (1 + S_err) in
seven equal steps (S_err is the hypothetical
fractional error level entered by the user); the
sum of nt fractions at one base always equals 1.0,

2) g2 is varied in the same manner as a2, i.e. we
dropped the restriction that Abun(D)+Abun(E) =
Abun(K)+Abun(R),

3) t3 and g3 are varied from 0.5 times (1 - S_err)
to 0.5 times (1 + S_err) in equal steps,

4) the smallest ratio Abun(lfaa)/Abun(mfaa) is
sought.

In actual experiments, we will direct the synthesizer
to produce the optimum distribution "Optimum vgCodon"
given above. Incomplete control over DNA chemistry
may, however, cause us to actually obtain the
following distribution, which is the worst that can be
obtained if all nt fractions are within 5% of the
amounts specified in "Optimum vgCodon". A
corresponding table can be calculated for any given
S_err using the program "Find worst vgCodon within Serr
of given distribution." given in Table 11.

     "Optimum vgCodon", worst 5% errors

          T      C      A      G
base #1 = 0.251  0.189  0.273  0.287
base #2 = 0.209  0.160  0.400  0.231
base #3 = 0.475  0.0    0.0    0.525

This distribution yields DNA encoding different
amino acids at the abundances shown in Table 12.

If five codons are synthesized with reagents mixed
so as to produce the nt-distribution "Optimum vgCodon",
and if we actually obtained the nt-distribution
"Optimum vgCodon, worst 5% errors", then DNA sequences
encoding the mfaa at all of the five codons are about
277 times as likely as DNA sequences encoding the lfaa
at all of the five codons; about 24% of the DNA
sequences will have a stop codon in one or more of the
five codons.

When five codons are synthesized using equimolar
mixtures at bases 1 and 2, (Abun(mfaa)/Abun(lfaa))^5 =
7776. If we program the optimum nt distribution and
come within 5%, then (Abun(mfaa)/Abun(lfaa))^5 = 277.
The total number of different PBDs is unchanged, but
the least-favored sequence is about 28 times more
abundant. Detecting the least-favored
sequence when varying four residues with equimolar nts
at each varied base requires as sensitive a separation
system as does detecting the least-favored amino-acid
sequence when varying five residues with the optimized
nt distribution.
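The factor of about 28 follows directly from the two ratios. The sketch below is a worked check only; it assumes that the figure 7776 corresponds to a per-codon ratio of 6 to 1 between the most- and least-favored amino acids (six codons versus one), and the variable names are hypothetical.

```python
# Per-codon ratio of most- to least-favored amino acid with equimolar nts,
# assuming six codons versus one codon (an assumption of this sketch).
per_codon_equimolar = 6.0
n_codons = 5

ratio_equimolar = per_codon_equimolar ** n_codons  # five varied codons
ratio_optimized = 277.0                            # value quoted in the text

# How much closer the least-favored sequence comes to the most-favored one.
improvement = ratio_equimolar / ratio_optimized

print(ratio_equimolar, improvement)
```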

By hypothesis, the distribution "Optimum vgCodon"
is used in the second version of the second variegation
of hypothetical example 2. The abundance of the DNA
encoding each type of amino acid is, however, taken
from Table 12. The abundance of DNA encoding the
parental amino acid sequence is:



                         F24       G30       D34       E42       T47
amount (parental seq.) = Abun(F) x Abun(G) x Abun(D) x Abun(E) x Abun(T)
                       = .0249   x .0663   x .0545   x .0602   x .0437
                       = 2.4 x 10^-7

Therefore, DNA encoding the PPBD sequence as well as
many related sequences will be present in sufficient
quantity to be detected and we are assured
that the process will be progressive.
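The abundance of the parental sequence is simply the product of the per-residue abundances. A minimal sketch, using only the five Table 12 values quoted in the text (the names `TABLE_12_EXCERPT` and `PARENTAL` are hypothetical):

```python
from math import prod

# Abundances quoted in the text for the parental residue types (from Table 12).
TABLE_12_EXCERPT = {"F": 0.0249, "G": 0.0663, "D": 0.0545, "E": 0.0602, "T": 0.0437}

# Parental residues at the five variegated positions.
PARENTAL = [("F", 24), ("G", 30), ("D", 34), ("E", 42), ("T", 47)]

# Fraction of the variegated DNA that encodes the parental sequence.
amount_parental = prod(TABLE_12_EXCERPT[aa] for aa, _position in PARENTAL)

print(f"{amount_parental:.1e}")
```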

A level of variegation that allows recovery of the
PPBD has two properties:

1) we cannot regress because the PPBD is
available,

2) an enormous number of multiple changes related
to the PPBD are available for selection and we are
able to detect and benefit from these changes.

The user must adjust the list of residues to be
varied and levels of variegation at each residue until
the calculated variegation is within the bounds set by
Mntv and Csensi.

Preferably, we also consider the interactions
between the sites of variegation and the surrounding
DNA. If the method of mutagenesis to be used is the
replacement of a cassette, we consider whether the
variegation will generate gratuitous restriction sites
and whether they seriously interfere with the intended
introduction of diversity. We reduce or eliminate
gratuitous restriction sites by appropriate choice of
variegation pattern and silent alteration of codons
neighboring the sites of variegation. See the Detailed
Example.



Mirtir -n^wa into 



25 



cor^. 14-1;„ 

>-ocii-riction sites were 
per cassette ""-^ :::rto introduce the 
designed and synthesized »d ^ 
synthetic vgOK. .nto «.e C^- ^ 

and 1^9-^""= ,,,, Of single-stranded- 

■ tide directed BUtagenesis, synthetic vgDHA 
oligonucleot.de-d.rected ^^^^^^ ^^^^^^^ ^ 

is used to create diversity i» 

^A.0^ Transfnmntinn nf cellsj. 

™. present Mention ^ not U»it. « »y ;^ne 
"--^ Of transfo^.ng cells - ^^^^^ 
"T'ldTr t:: p^icur host ceus and ocv. .he 
ranrtTpLuL a n™.-;^ 
transfor^ants, preferably lo' of Bore 

transformed cexxs 
necessary to " reparation. «e prefer to 

transfon.at.on and ^«-^*^ J „tration so that 
have transformed cells at hig 
they can be plated densely on relatively P 
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gon. 14.3: 

The transformed cells are grown first under non-
selective conditions to allow expression of plasmid
genes and then selected to kill untransformed cells.
Transformed cells are then induced to express the osp-
pbd gene at the appropriate level of induction, as
determined in Sec. 10.1. The GPs carrying the IPBD are
harvested by a method appropriate to the package.

A high level of diversity can be generated by in
vitro variegated synthesis of DNA and this diversity
can be maintained passively through several generations
in an organism without positive selective pressure.
Loss or reduction in frequency of deleterious mutations
is advantageous for the purposes of the present
invention. It is preferable that the selection
be performed before more than a few generations elapse.
Moreover, subdividing the variegated population before
amplification in an organism by removing a small sample
(less than 10%) for further work would result in loss
of diversity; therefore, one should use all or most of
the synthetic DNA and most or all of the transformed
cells.

Isolation of GP(PBD)s with binding-to-target
phenotypes:

The harvested packages are enriched for the
binding-to-target phenotype by use of affinity
separation involving target material immobilized on a
matrix. Packages that fail to bind to target material
are washed away. If the packages are bacteriophage or
endospores, it may be desirable to include a
bactericidal agent, such as azide, in the buffer to
prevent bacterial growth.



Sec. 15.1:



2.^^^rMrr^a •h^r-gp.t matPrial t.o a column: 
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Affinity column chromatography is the preferred 
method of affinity separation, but other affinity
separation methods may be used. A variety of
commercially available support materials for affinity
chromatography are used. These include derivatized
beads to which the target material is covalently 
linked, or non-derivatized material to which the target 
material adheres irreversibly. 

Suppliers of support material for affinity
chromatography include: Applied Protein Technologies,
Cambridge, MA; Bio-Rad Laboratories, Rockville Center,
NY; Pierce Chemical Company, Rockford, IL. Target 
materials are attached to the matrix in accord with the 
directions of the manufacturer of each matrix 
preparation with consideration of good presentation of 
the target. 

Reducing selection due to non-specific binding:

We reduce non-specific binding of GP(PBD)s to the
matrix that bears the target in two ways:

1) we treat the column with blocking agents such 
as genetically defective GPs or a solution of 
protein before the population of GP(vgPBD)s is 
chromatographed, and 

2) we pass the population of GP(vgPBD)s over a 
matrix containing no target or a different target 
from the same class as the actual target prior to 
affinity chromatography. 
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step (1) abov. saturates any non-specific ""^i"^ th.t 
tlfa flnity matrix .ight snow toward wUa-type OPs or 
tne arrim. i removes coinponents of our 

proteins in general, step the 
population that exh^xt non spec.f ^^^^^^^ 

= lUrtrrr:: ntr: ^o^lohin, for exampi. 

rllTTu^ortin, hovine ---/r ^rft: 

trap CPS ----^^ --^^^^^^^ Uet. 
binding to proteins. If onoi 
then a hydrophobic compound such 
tertiarybutylbenzyl alcohol, could 

OPS displaying PB.s having ^^^^ --^-^2^ PBo! 
to prematurely terminated 

that fail to ^.paoity of the 

will he Jri^inat^^^ adhesive 

initial column ^^^1^^:71^ fold greater, than the 
PBDs should be greater (S^ 5 "^"-^ 
column that supports the target molecule. 

Variation in the support material (polystyrene,
glass, agarose, etc.) in analysis of clones carrying
SBDs is used to eliminate enrichment for packages that
bind to the support material rather than the target.

The population of GPs is applied to an affinity
matrix under conditions compatible with the intended
use of the binding protein and the population is
fractionated by passage of a gradient of some solute
over the column. The process enriches for PBDs having
affinity for the target and for which the affinity for
the target is least affected by the eluants used.
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Xons or cofactors ^--^\'ZTT^o.ir^ 
(derived fro. IPBD) - target^^ ^^^^^ ^^^^^^ 

buffers at -PP^°P^^^^\^. ' target by washing the 
GP(PBD)s that do not hxnd^ ^^^^^^ 

5 .atrix with the °^ J ^ ,30 n.) baclc 

to bring the optxcal densxty ( _ ^ ^^^^^^ ^^^^ ^ 
to base line plus one to t increasing: a) 

colu:an is then eluted wxth ^ ^^^^^l^^^^^, solutes, d) 
salt, b) <--7^;^\f decreasing), or e) so.e 

temperature " ' Salt is the most 

combination of f Jf;j,;^ation. Other solutes 
preferred solute for ^^^^^^^ interaction may also 

that generally -"^7;^^^^^^^^^ ,,„taining any of 
be used. "Salt" includes solu 
the following ionic- specxes: 



Na+          K+           Li+          Rb+
Cs+          NH4+         Ca++         Sr++
Ba++         Cl-          Br-          Acetate
Citrate      SO4--        HSO4-        PO4---
HPO4--       H2PO4-       CO3--        HCO3-
Guanidinium Cl
Standard nucleotides
Standard L-amino acids

Other ionic or neutral solutes may be used. All
solutes are subject to the necessity that they not kill
the genetic packages. Neutral solutes, such as
ethanol, acetone, ether, or urea, are frequently used
in protein purification but are very harmful to
bacteria and bacteriophage above low concentrations.
Bacterial spores, on the other hand, are impervious to
most neutral solutes. Several passes may be made
through the steps in Sec. 15. Different solutes may be
used in different analyses, salt in one, pH in the
next, etc.



Recovery of packages:

Recovery of packages that display binding to an
affinity column may be achieved in several ways,
including from:

1) fractions eluted with a gradient as described
above,

2) fractions eluted with soluble target material,

3) cells grown in situ on the matrix,

4) cells incubated with parts of the matrix,

5) fractions eluted after chemically or
enzymatically degrading the linkage holding the
target to the matrix, and

6) regeneration of GPs after degrading the
packages and recovering OCV DNA.

to utilize combinations, of these 
It is possible to utili ^ ^^^^ 

T-i- cViQuld be remembered tna-c wna 
methods. It shouia 

.ut the in.o^^-- - Jt .e"ve^ 

25 very strongly preferred, 

material is essential. 

xnadvertent inactivati.n the ^^J^J^ 

• = It is preferred that naximum liBits fer 
deleterious. It is prei denature the 

30 solutes that do not i--^-^^ use 
target or the column are ^^3, 

. . denature the column to exuL. 

conditions that denacur portion of the 

4->,« taraet is denatured, a portion 
s before the targer is possible use as 

------ r:.rar:::: ::erer protein- 

35 an inoculum. As the t.ft' 
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^ other non-covalent molecular 
p.o.ein in-acti.- ana ther ^^^^^ ^^^^^ 

interactions, ^^^^ ^^gi^tly to the target 

.olecular ^^^^^^^l^.l^'J^,^ lat the CPs can not 
molecules on the affinity ^^^^ ^^^^^ 

,e washed off in ^'^'^ '°^;^Jl^,^,r..a. m these 
When very tight bindxng has heen ^^^^ 

cases, .ethods (3) ^^^^^^^J^^, ^l^ic messages fro. 
obtain the bound packages or the gen 
the affinity matrix. 

«-F -J-lie elution 

conditions, to lsolat. SBDs th ^^^^ ^^^^^ _ 

one pH (PHt,) but not at . ^ ^^^j^^^ 

population is ^^f^^^J''^^^ than eluted with 
thoroughly at pHi,. m 

..„er at pHo and =Ps that^o»e o«^^^^^^^ ^^^^ 
collected and cultured. Similar P ^^p^^ature. 

other solution P^^^-^*!^"'^ to a colu^ 

.or e.a.ple, -<^^- ^ J^^^tth salt to remove 
supporting xnsuUn. Mt „^ ,i„te wxth 

OPS with little or - ^-^-^ ,,,,,,, PBDS that 

rsu!ro:Vi:reTn a co^etitive .nner. 



25 _5ec 



4-1,0 vnrichedPasKaa^ 



30 



Viable GPs having the selected binding trait are
amplified by culture in a suitable medium, or, in the
case of phage, infection into a host so cultivated. If
the GPs have been inactivated by the chromatography,
the OCV must be recovered from them and introduced
into a new, viable host.
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The probability o separation cycle, 

blndin, by "^^^^f J,.,, se^ences 

rem-r rrrrruty o. .olat., a 
single SBD is 0.10 or higher. 
K = the smallest integer >= log10(0.10 N)/log10(Ceff)

For example, if N were 1.0 x 10^7 and Ceff = 6.31 x
10^2, then log10(1.0 x 10^6)/log10(6.31 x 10^2) =
6.0000/2.8000 = 2.14. Therefore, we would attempt to
isolate SBDs after the third separation cycle. After
only two separation cycles, the probability of finding
an SBD is

(6.31 x 10^2)^2/(1.0 x 10^7) = 0.04

and attempting to isolate SBDs might be profitable.
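The rule for K can be written as a short calculation. This is a sketch under the assumptions of the example above (N distinct sequences, a constant enrichment Ceff per separation cycle); the function names `cycles_needed` and `fraction_after` are hypothetical.

```python
import math


def cycles_needed(n_sequences, c_eff, p_min=0.10):
    """Smallest integer K with K >= log10(p_min * N) / log10(Ceff)."""
    return math.ceil(math.log10(p_min * n_sequences) / math.log10(c_eff))


def fraction_after(cycles, n_sequences, c_eff):
    """Expected fraction of a single SBD after the given number of cycles."""
    return min(1.0, c_eff ** cycles / n_sequences)


N = 1.0e7
CEFF = 6.31e2

k = cycles_needed(N, CEFF)
p_two_cycles = fraction_after(2, N, CEFF)

print(k, p_two_cycles)
```

The cap at 1.0 in `fraction_after` is a design choice of the sketch: once the enrichment exceeds the number of sequences, the SBD cannot make up more than the whole population.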
• Clonal isolates .ro. the last fraction e^ted^in ■ 

isolates obtained by , if r separation 

a-ffinitv matrix, are culturea. 
erfhUU co^plete.^^^^^^^ — ; - 

of these clonal isolates axti 
25 32, oi uiAc:='=^ none of trie 

4.^^ f-ha-raet) colxHtm. It none 
properties °" '^X*^',,, ^^^^ i.p.oved binding to 

isolated, geneticauy pure ^^^^ 

target, or if K <^=les ^.^^^^^ 
then we pool and culture, in ^^^^ 
30 manner set forth m Sec. 14.3 the 

^^Lrr^ir thrr:-- .y -turing an 
Viable GPS ana ij-"^ repeat 
• taken from the column matrix. We then r p 

inoculum taKen irou ^ i„ qec 15. This 

the enrichment procedure described m Sec. 
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^cxic enriC^ent w continue K=nro» P--= " """^ 
an SBD is isolated. 

the isolated GPS has iwroved 
" - „e determine whether 

5 retention on the (targe ) affinity 
..e retention of ^^^^^Haterial is attached 
for the target . „pti„al density and 

to a different support Batrix are measured, 

the elution volu.es of cand. at -<SBO,s^ ^^^^^^^ 

10 we picK the tamed on the colu^ after 

elution volume or that ^ l,i,ner 
elution. If none of the ca ^^^^^^ _ „^ 

,l,tion volume jast few fractions 

pool and culture th^ G^s J ^^^^^^^^ 
that contained v.^lej^^ „,trix. We 

oulturing an inoculum taKen 
then repeat the enrichment procedure 

, «Ds show binding that is superior 
If all of the SBDS show 

,0 to .PBD of this round, we P°°/- the 
the last fraction that contains viahl 

inoculum taKen from the column. Thi P P ^^^^.^^^^^ 
chromatographed at least one pass 
further the GPs based on Kd. 
25 rv the EHi would 

« an P-;;- rtss^tTnce of a helper 

either be cultured -^^J^ amplified, 
phage or be reverse transcri ^^^^^^^^ 
The amplified DHA could then be q 
30 into suitable plasmids. 

>. „ the population showing 
«e '^^-^J^'^V ^ettfc'and biochemical 

35 desired binding properties r>y g 
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a. we obtain clonal isolates and test these 
strains by genetic and affinity methods to determine
genotype and phenotype with respect to binding to
target. For several genetically pure isolates that
show binding, we demonstrate that the binding is caused
by the artificial chimeric gene by excising the osp-sbd
gene and crossing it into the parental GP. We also
ligate the deleted backbone of each GP from which the
osp-sbd gene is removed and demonstrate that each
backbone alone cannot confer binding to the target. We
sequence the osp-sbd gene from several clonal
isolates.






Testing of binding affinity:

For one or more clonal isolates, we subclone the
sbd gene fragment, without the osp fragment, into an
expression vector such that each SBD can be produced as
a free protein. Each SBD protein is purified by normal
means, including affinity chromatography. Physical
measurements of the strength of binding are then made
on each free SBD protein by one of the following
methods: 1) alteration of the Stokes radius as a
function of binding of the target material, measured by
characteristics of elution from a molecular sizing
column, 2) retention of radiolabeled SBD on a spun
affinity column to which has been affixed the target
material, or 3) retention of radiolabeled target
material on a spun affinity column to which has been
affixed the SBD. The measurements of binding for each
free SBD are compared to the corresponding
measurements of binding for the PPBD.

In each assay, we measure the extent of binding as
a function of concentration of each protein, and other
relevant physical and chemical parameters.

In addition, the SBD with highest affinity for the
target from each round is compared to the best SBD of
the previous round (IPBD for the first round) and to
the IPBD with respect to affinity for the target
material. Successive rounds of mutagenesis and
selection-through-binding yield increasing affinity
until desired levels are achieved.

If binding is not yet sufficient, we must decide
which residues to vary next (see Sec. 16.0).

rpci that bind 
.ACS .ay be used to separate GPs 

^ labeled target with the op 
fluorescent labeiea jj we discriminate 

parameters determined in Par ; '^^^ i^j,ie by 

,0 against artifactual -"^^ ^^^^^^^^^^ ehosen to be 

using two or more differen 
structurally different. 

.Pfinitv separation uses unaltered 
Electrophoretic affinity sep ^.^^ 

,S target so that only other ions in the ^^^^^^^ 

rise to artifactual binding. independent 
the gel material gives rise to retar ^.^.^^^^^^ ^ 
Of field direction and - - ^ 
variegated population of GPs wii 



30 charges. 



35 



the verle,eted population -^^^^j^ 
1^ , ael that contains no target 
electrophoresed in a g ^„„tinues until the GPs 

material. The electrophoresis 
are distributed along the length 
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^ir.y. the initial electrophoresis 
target-free lane xn whxch i^a xn ^^^^^^ ^^^^ ^ 

is conducted is separated by a remo ^j^^ 
.ouare of gel that contains target material . Th 
square or g . , . gecond electrophoresis is 

baffle is removed and a seco 

5 conducted at right angles to ^^""f' ^^^i^e 

not bind target migrate with unaltered -^^^^^'^J^^ 

- - rrgr rrgtin:: :rr- 

that do not bind target. a 

that ao excised and 

^- rp«5 will form. This ime 

binding GPs wiii dissolved and 

10 discarded. Other parts of the gel a 
the GPs cultured. 

The F— ^ v^^iecx^tion Cycl ei 

c the PBD should be varied in the 
ie Which residues of the eau si , ^ ^ 4.n 

next variegation cycleT The general rule is to 
"fserve as much accumulated information as possible- 
preserve ^^^3 best 

The ammo acids 3ust v residues has 

determined. The- environment of 
20 Changed, so that it is appropriate to vary t g 

R^cause there are always more residues in the princ p 
Because there ^.^^ simultaneously, 

and secondary sets than can b ^^^^ 
„^ start by picking residues that either 

^cle does net allow a enough lavel ct dxvarsxty 

^ -.^1 ir, the previous cycle might be 

then residues varied m the prev , 

varied again. For example, if the number of
independent transformants that can be produced and the
sensitivity of the affinity separation were such that
seven residues could be varied, and if the principal
and secondary sets contained 13 residues, we would
always vary seven residues, even though that implies
varying some residue twice in a row. In such cases,
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„ouX. pic. «>e rescues /--^^^^/^Xfe: 
aaino acid, of highest abundance xn the 

codons used. 

It is the accumulation of information that allows 
^„ select those protein sequences that 
Z^:^^ h^en the S.O and 
Interfaces between ----^J ^ ^^^^^^^^ 
twenty or more residues -P^^^^^^ 

10 residues would W*^^ ^ ^„,,«,er In space 

dividing the residues «.at l.e cl 
into overlapping groups of five to 
can vary a large surface but never need to test m 
Tan 10^ to 10- candidates at once, a savings of 10 
15 to 10^'^ fold. 

Having picked the residues to vary, we again set
the range of variegation for each residue according to
the principles set forth in Sec. 13.2, design the vgDNA
encoding the desired mutants (Sec. 13.3), clone the
vgDNA into GPs (Sec. 14), and select-by-binding-to-
target those GPs bearing SBDs (Sec. 15).

OTHER CONSIDERATIONS:

Joint selections:

One may modify the affinity separation of the
method described to select a molecule that binds to
material A but not to material B. One needs to prepare
two selection columns, one with material A and the
other with material B. The population of genetic
packages is prepared in the manner described, but
before applying the population to A, one passes the
population over the B column so as to remove those
members of the population that bind to B. It may be
necessary to amplify the population that
does not bind to B before passing it over A.
Amplification would most likely be needed if A and B
were in some ways similar and the PPBD had been
selected for having affinity for A.

For example, to obtain an SBD that binds A but not
B, three columns could be connected in series: a) a
column supporting some compound, neither A nor B, or
only the matrix material, b) a column supporting B, and
c) a column supporting A. A population of GP(vgPBD)s
is applied to the series of columns and the columns are
washed with the buffer of constant ionic strength that
is used in the application. The columns are uncoupled
and the third column is eluted with a gradient to
isolate GP(PBD)s that bind A but not B.

One can also generate molecules that bind to both
A and B. In this case we use a 3D model and mutate one
face of the molecule in question to get binding to A.
We then mutate a different face to produce binding to
B.
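The logic of the three columns in series can be illustrated with a toy model. The sketch below is an idealization: it assumes that each GP either binds a material completely or not at all, whereas real retention is graded. The function `series_selection` and the example population are hypothetical.

```python
def series_selection(population, columns):
    """Pass GPs through the columns in order.

    Returns (retained, flow_through): retained[i] lists the GPs held on
    columns[i]; flow_through lists the GPs that bound none of the columns.
    """
    retained = []
    flow = list(population)
    for material in columns:
        kept = [gp for gp in flow if material in gp["binds"]]
        flow = [gp for gp in flow if material not in gp["binds"]]
        retained.append(kept)
    return retained, flow


# Hypothetical population: each GP lists the materials its PBD binds.
population = [
    {"name": "gp1", "binds": {"A"}},
    {"name": "gp2", "binds": {"A", "B"}},
    {"name": "gp3", "binds": {"matrix", "A"}},
    {"name": "gp4", "binds": set()},
]

# Column order from the text: matrix only, then B, then A.
retained, flow_through = series_selection(population, ["matrix", "B", "A"])

# GPs held on the third column bind A but neither B nor the bare matrix.
binds_a_not_b = [gp["name"] for gp in retained[2]]
print(binds_a_not_b)
```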

The materials A and B could be proteins that
differ at only one or a few residues. For example, A
could be a natural protein for which the gene has been
cloned and B could be a mutant of A that retains the
overall 3D structure of A. SBDs selected to bind A but
not B must bind to A near the residues that are mutated
in B. If the mutations were picked to be in the active
site of A (assuming A has an active site), then an SBD
that binds A but not B will bind to the active site of
A and is likely to be an inhibitor of A.

To obtain a protein that will bind to both A and
B, we can, alternatively, first obtain an SBD that
binds A and a different SBD that binds B. We can then
combine the genes encoding these domains so that a
two-domain single-polypeptide protein is produced. The
fusion protein will have affinity for both A and B.

One can also generate binding proteins with
affinity for both A and B, such that these materials
compete for the same site on the binding protein. We
guarantee competition by overlapping the sites for A
and B. We first create a molecule that binds to

material A. We then vary a set of residues ae 
TZ.. residues that were varied to o>=tain Mnd^^^^^^^^^ 

- TL I r^nt";. ^nte^irand so are 

residues of ser ^a; ^ .^^ * „^ r Residues 

unlikely to bind directly to either A or B. . 
in set (b) are likely to xnake s.all changes in the 

in sex. v-^v , . such that the 

affinity to both A and B. 

Selection for non-binding:

The method of the present invention can be used to
select proteins that do not bind to selected targets.
Consider a protein of pharmacological importance, such
as streptokinase, that is antigenic to an undesirable
extent. We can take the pharmacologically important
protein as IPBD and antibodies against it as target.
Residues on the surface of the pharmacologically
important protein would be variegated and GP(PBD)s that
do not bind to an antibody column would be collected
and cultured. Surface residues may be identified
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X *^«-m a 3D structure, b) 
several ways, including: a) from a 3D ^^^^ 

f-r "cnr — e" ^^^^^^^^ 

labeling. Tne nref erred guxde to 

important protein J^^ residues 

picUn, residues to va«' » Uttle as 

r.i::::/rorn:.-^ce unaltered. 

— rrfd 

that all or .ost "'^ f/^f^.^, „ould have a set 
in a single molecule. Prefera y, antibody 

- — - rh:r:'s:ieV:T:oncciUi anti.ody 

J could Obtain r ^ -d 

.inking to Stations in one 

then cosine s^e or al important 
molecule to produce a v monoclonal 
p^tein recogni.ed hy none o. ^^^^^^ 
antibodies. Such mutants must be 

rarrdTri^^^^rrref: the mutations. 

^ ^ically. ^-clc^al^-r"-^^^^^^ 

of binding constants for antigen. 

polyclonal antibodxes proceed as 

pharmacologically i»Port»t pro^^^^^^ 
follows. «e engine- the pharm g^ ^^^^^^^^^ 
,0 protein to appear on ^^.^ are on the 

«e introduce mutations protein or 

surface of the phannacoiogj.^»--i ^ .„^,,ce of the 
l„to residues thought ^o be on the = ^^^^ ^ 

^-rt=°^t""s\st:ar: Crclonal antibodies 
35 population of GVS 
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«r,rt the population of GPs is 
„e attacned to a -^^^J'^^'J. .ne colu«, 1= 
applied to the that elute at the 

eluted with a salt ^rad^nt ^^^^^ 
lowest concentration ct s ^^^^ ^^^^ 

pharmacoxogicaxxy »^°f;"'^^^,„,t., binding to the 
mutated in a vay ^^^J^ ,„i„ity for the 
.ntihodies having „. j^e GPs elating 

pharxaacologically J^,,,, cultured. The 

at the lowest salt are ^^^^^^^ 

isolated SBD beco.es the ^ ,,,^i„ants are 

variegation so that 

successively eliminated. 



Fl rr 17.3.1 
15 stEUStiiES 



_SelectiOiL_-2£ — ^ 



20 



25 



, insertions or deletions that 

„3 can select for m ^^.^^ proteins. 

preserve the 3D ^ f ,,^1 on its surface. Xn 

consider on GP that ^XP-- ^f^^ for K2e 

the bEUz^ gene, we can re^^^^ ^3,, , .O^ 

and A27 with j;^;;;^ , turn and are far fro. 

sequences) . K26 and A27 are ^..^.tion-through- 
..e trypsin -^^f J .utants of BPTI that 



Seci__l7i-4J 



_Created_biBto 




int nnique;_ 



30 



35 



are a large number of SBDs 
For each target, there present 
that may be found by ^^^ J .^.at some PBD 

invention. To -rease P ^^^^^^^ ^„ 

in the population wxll bxnd ^„,,,,iently subject to 
as large a 

population as »e ^^^^^^^ ,^ ,..„ager.ent 

selection-through-h.nd^^ Jy transfor^ants can we 
Of the method are 
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,m a component can we find 
produce?.., and "How ^^^^^ ^ . ... . Geneticists 

Lrougl. -lection-through-bxndxn^ . 

routinely find -"J,^^^^^^^^ T.e opti.u. 

,010 ..ing ^^^^^^^ t.e .axi.u. 

level of variegatxon selection sensitivity, 

nu^er of transforxaants and ^^^^ a 
so that for any reasonal^le sensxtxvx y 
progressive process to ^a.n a ^r.e^^ ^P^^^^ ^^^^^^ 

.igher and ^^^^^^^^/'^^^^^ .y a single pass of 

material. ^-^^^^^^^^J ^.^^^ ,,.e ..en demonstrated 
elution from an affinity p 

{SMIT85) . 

°-""t :::;:e„ic,, - P-eau.e can 
re«on (S^ * variation ■ parameters . P=>r 

repeated J^"^^^^ ,,„,,ent residues to vary or 

e«>mple, "^'f """Xtribution at variegated codons 
pick a different nt distriou ^^^^^ 

L that a - ---rif r^^^^^^ set of 

the same residues. Even if different SBD if 

residues is used "^^^^^^ ,o he varied is 
the order in which one piclcs sue 

altered. 

.Odes Of creating diversity - 
..s discussed ^..serves at 

Tfrr'n Of reformation obtained fro. 
isast a large i':^f^^ i^roduces otier mutations m 

the same domain wu--^ 
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produced and the -<»»>\°^/" ^^^,,,,e the preferred 
through affinity separati ^ ^^^t focuses 

«^odlment uses a ""^^""'^ °' "^^J „e BOSt UKely to 
stations into those residues ^at^^ ^^^^^ 

^ rdeX" 't^r^- — - 

•^v,4- allow other GPs 
other modes of mutagenesis mxg 

riophage 

«ri For example, t:ne 
be — Cloning vehicle for cassette 
lambda is not a useful ^^^^^^^^ restriction 

mutagenesis because o smgle-stranded-oligo- 
.i«s. one can. however u^ J^^^^^ ^^^^ 
nt-dlrected mutagenesis 

^i^e mutagenesis to introduce the 

stranded-oligo-nt-directed ,^ 

.evel Of ..ch a method wou.d 

invention, but if it is v 
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■Rvatnple 1 



10 



15 



20 



Phage 



25 



presented below is a hypothetical example of a 

^«in Zy necessary to obtain the desxred results. 
.:r.me Ldlf ications in the preferred .ethod are 
Ilscussed i«.ediately followln, various steps of the 
hypothetical example. 

By hypothesis, we set the following technical 
capabilities : 

500 ng/synthesis of ssDNA 100 bases 
long, 

10 ug/synthesis of ssDNA 60 bases long, 
1 mg/synthesis of ssDNA 20 bases long. 



Mdna      100 bases

Ypl       1 mg/l

Lef       0.1 % for blunt-blunt,
          4 % for sticky-blunt,
          11 % for sticky-sticky.
Ceff      900-fold enrichment



C=or,=i 1 in 4 X 10^ 
'-sensi 



Nchrom    10 passes
Serr      5 %
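The assumed synthesis yields can be converted into numbers of DNA molecules, which bounds the diversity that a single synthesis can carry. The following is a rough sketch only: it assumes an average mass of about 330 g/mol per nucleotide of ssDNA, a figure not taken from this document, and the function name `molecules` is hypothetical.

```python
AVOGADRO = 6.022e23
# Approximate average mass of one nucleotide in ssDNA, g/mol (assumption).
AVG_NT_MASS = 330.0


def molecules(mass_g, length_nt):
    """Approximate number of ssDNA molecules in a given mass."""
    moles = mass_g / (length_nt * AVG_NT_MASS)
    return moles * AVOGADRO


# Yields per synthesis assumed in the hypothetical example: length -> grams.
YIELDS = {100: 500e-9, 60: 10e-6, 20: 1e-3}

counts = {length: molecules(mass, length) for length, mass in YIELDS.items()}
for length, n in counts.items():
    print(f"{length} bases: {n:.1e} molecules")
```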



Example 1, Part I



In this example, we will use M13 as a replicable
GP and BPTI as IPBD. In Part I, we are concerned only
with getting BPTI displayed on the outer surface of an
M13 derivative. Variable DNA may be introduced in the
osp-ipbd gene, but not within the region that codes for
the trypsin-binding region of BPTI. Once BPTI is
displayed on the outer surface of an M13
derivative, we proceed to Part II to optimize the
affinity separation procedures.

For this example, we choose a filamentous
bacteriophage of E. coli, M13. We prefer phage over
vegetative bacterial cells because phage are much less
metabolically active. We prefer phage over spores
because the molecular mechanisms of the virion 
formation and 3D structure of the virion are much 
better understood than are the corresponding processes 
of spore formation and structures of spores. 

M13 is a very well studied bacteriophage, widely 
used for DNA sequencing and as a genetic vector; it is 
a typical member of the class of filamentous phages. 
The relevant facts about M13 and other phages that will 
allow us to choose among phages are cited in Sec.
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1.3.1. 

Compared to other bacteriophage, filamentous phage
in general are attractive and M13 in particular is
especially attractive because:

1) the 3D structure of the virion is known,

2) the processing of the coat protein is well
understood,

3) the genome is expandable,

4) the genome is small,

5) the sequence of the genome is known,

6) the virion is physically resistant to shear,
heat, cold, guanidinium Cl, low pH, and high salt,

7) the phage is a sequencing vector so that
sequencing is especially easy, and

8) antibiotic-resistance genes have been cloned
into the genome with predictable results (HINE80).

Other criteria listed in Sec. 1.0 and 1.3 are
also satisfied: M13 is easily cultured and stored
(FRIT85), each infected cell yielding 100 to 1000 M13
progeny after infection. M13 has no unusual or
expensive media requirements and is easily harvested
and concentrated (SALI64, YAMA70, FRIT85). M13 is
stable toward physical agents: temperature (10% of
phage survive 30 minutes at 85°C), shear (Waring
blender does not kill), desiccation (not applicable),
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radiation (not applicable) , age (stable for years) . 

H13 is stable toward chemicals: pH (< 2-2 
(SHITSS)), surface active agents: not applicable, 
ZZlJJ.i (guanidiniu. HCl = S.o K) , ions (no speo.f.c 
sensitivities), organic solvents (ether and other 
organic solvents are lethal (MAEV78)), E"^"™ 
applicable, HHMb not . protease). M13 Is not Xnown to 
be sensitive to other enzymes. 

The M13 genome is 6423 b.p. and the sequence is known
(SCHA78). Because the genome is small, cassette
mutagenesis is practical on RF M13, as is
single-stranded oligo-nt directed mutagenesis (FRIT85).
M13 is a plasmid and transformation system in itself,
and an ideal sequencing vector. M13 can be grown on
strains of E. coli. The M13 genome is expandable
(MESS78, FRIT85). M13 confers no advantage but
doesn't lyse cells. The sequence of gene VIII is
known, and the amino acid sequence can be encoded on a
synthetic gene, using the lacUV5 promoter and used in
conjunction with the lacIq repressor. The lacUV5
promoter is induced by IPTG. Gene VIII protein is
secreted by a well studied process and is cleaved
between A23 and A24. Residues 18, 21, 22, and 23 of
gene VIII protein control cleavage. Mature gene VIII
protein makes up the sheath around the circular ssDNA.
The 3D structure of f1 virion is known at medium
resolution; the amino terminus of gene VIII protein is
on the surface of the virion. No fusions to M13 gene VIII
protein have been reported. The 2D structure of M13
coat protein is implicit in the 3D structure. Mature
M13 gene VIII protein has only one domain. There are
four minor proteins: gene III, VI, VII, and IX. Each
of these minor proteins is present in about 5 copies
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per virion and is related to morphogenesis or 
infection. The major coat protein is present in more 
than 2500 copies per virion. 

Although no fusions of M13 gene VIII to other 
genes have been reported, knowledge of the virion 3D 
structure (BMIN810) makes attachment of IPBD to the 
amino terminus of mature M13 coat protein (M13 CP) 
quite attractive. Should direct fusion of BPTI to M13 
CP fail to cause BPTI to be displayed on the surface of 
M13, we will vary part of the BPTI sequence and/or 
insert short random DNA sequences between BPTI and M13 
Smith (SMIT85) and de la Cruz et al. (CRUZ88) have
shown that insertions into gene III cause novel protein
domains to appear on the virion outer surface. If BPTI
can not be made to appear on the virion outer surface
by fusing the bpti gene to the m13cp gene, we will fuse
bpti to gene III either at the site used by Smith and
by de la Cruz et al. or to one of the termini. We will
use a second, synthetic copy of gene III so that some
unaltered gene III protein will be present.

The gene VIII protein is chosen as OSP because it 
is present in many copies and because its location and 
orientation in the virion are known. Note that any 
uncertainty about the azimuth of the coat protein about 
its own alpha helical axis is unimportant. 

The 3D model of f1 indicates strongly that fusing
BPTI to the amino terminus of M13 CP is more likely to
yield a functional protein than any other fusion site.
(See Sec. 1.3.3).
The amino-acid sequence of M13 pre-coat (SCHA78),
called AA_seq1, is

AA_seq1

    5    6    6    7  7
    5    0    5    0  3
MVVVIVGATIGIKLFKKFTSKAS

The best site for inserting a novel protein
domain into M13 CP is after A23 because SP-I cleaves
the precoat protein after A23, as indicated by the
arrow. Proteins that can be secreted will appear
connected to mature M13 CP at its amino terminus.
Because the amino terminus of mature M13 CP is located
on the outer surface of the virion, the introduced
domain will be displayed on the outside of the virion.

BPTI is chosen as IPBD of this example (See Sec.
2.1) because it meets or exceeds all the criteria: it
is a small, very stable protein with a well known 3D
structure. Marks et al. (MARK86) have shown that a
fusion of the phoA signal peptide gene fragment and DNA
coding for the mature form of BPTI caused native BPTI
to appear in the periplasm of E. coli, demonstrating
that there is nothing in the structure of BPTI to
prevent its being secreted.

Marks et al. (MARK87) also showed that the
structure of BPTI is stable even to the removal of one
of the cystine bridges. They did this by replacing
both C14 and C38 with either two alanines or two
threonines. The C14/C38 cystine bridge that Marks et
al. removed is the one very close to the scissile bond
of BPTI; surprisingly, both mutant molecules
functioned as trypsin inhibitors. This indicates that
BPTI is redundantly stable and so is likely to fold
into approximately the same structure despite numerous
surface mutations. Using the knowledge of homologues,
vide infra, we can infer which residues must not be
varied if the basic BPTI structure is to be maintained.

The 3D structure of BPTI has been determined at 
high resolution by .-ray diffraction <H"«^;.':/^«^^; 
„^OC84, WL0D3,a, «LODS,b, . neutron 

and by NMR (WAGN87). In one of the X-ray
structures deposited in the Brookhaven Protein Data
Bank, "6PTI", there was no electron density for A58,
indicating that A58 has no uniquely defined
conformation. Thus we know that the carboxy terminus does
not make any essential interaction in the folded
structure. The amino terminus of BPTI is very near to
the carboxy terminus. Goldenberg and Creighton
reported on circularized BPTI and circularly permuted
BPTI (GOLD83). Some proteins homologous to BPTI have
more or fewer residues at either terminus.

BPTI has been called "the hydrogen atom of protein
folding" and has been the subject of numerous
experimental and theoretical studies (STAT87, SCHW87,
GOLD83, CHAZ83).

BPTI has the added advantage that at least 32
homologous proteins are known, as shown in Table 13. A
tally of ionizable groups is shown in Table 14 and the
composite of amino acid types occurring at each residue
is shown in Table 15.
BPTI is freely soluble and is not known to bind
metal ions. BPTI has no known enzymatic activity.
BPTI is not toxic. BPTI binds to trypsin,
Kd = 6.0 x 10^-14 M (TSCH87). If K15 of BPTI is
changed to L, there is no measurable binding between
the mutant BPTI and trypsin (TSCH87).

All of the conserved residues are buried; of the
seven fully conserved residues only G37 has noticeable
exposure. The solvent accessibility of each residue in
:S given i^ Table IS which was calculated from the 
entry "6PTI" in the Brookhaven Protein Data Bank with a
solvent radius of 1.4 A, the atomic radii given in
Table 7, and the method of Lee and Richards (LEEB71).
71 of the 51 non-conserved residues can accommodate 
"for more .inds cf amino acids. By independently 
substituting at each residue only those amino ac ds 
^ ,-1- t-hat residue, we could obtain 
..ready observed ^ ^J^^^ 

rs™S:i lill fcld into structures very similar to 

BPTI * 

BPTI will be useful as an IPBD for macromolecules
(See Sec. 2.1.1). BPTI and BPTI homologues bind tightly
and with high specificity to a number of enzymes.

BPTI is strongly positively charged except at very
high pH, thus BPTI is useful as IPBD for targets that
are not also strongly positive under the conditions of
intended use (See Sec. 2.1.2). There exist homologues
of BPTI, however, having quite different
SCI-III from B^ ^t "'^ 

inhibitor from bovine colostrum at -1). Once a
derivative of M13 is found that displays BPTI on its



wo 90/02809 



PCr/US89/03731 



surface, the sequence of the BPTI domain can be
replaced by one of the homologous sequences to produce
acidic or neutral IPBDs.

BPTI is not an enzyme (See Sec. 2.1.3). BPTI is
quite small; if this should cause a pharmacological
problem, two or more BPTI-derived domains may be joined
as in the human BPTI homologue that has two domains.

A derivative of M13 is the preferred OCV. (See
Sec. 3). A "phagemid" is a hybrid between a phage and
a plasmid, and is used in this invention.
Double-stranded plasmid DNA isolated from phagemid-bearing
cells is denoted by the standard convention, e.g.
pXY24. Phage prepared from these cells would be
designated XY24. Phagemids such as Bluescript K/S
(sold by Stratagene) are not suitable for our purposes
because Bluescript does not contain the full genome of
M13 and must be rescued by coinfection with helper
phage. Such coinfections could lead to genetic
recombination yielding heterogeneous phage unsuitable
for the purposes of the present invention.

The bacteriophage M13 bla 61 (ATCC 37039) is
derived from wild-type M13 through the insertion of the
beta lactamase gene (HINE80). This phage contains 8.13
kb of DNA. M13 bla cat 1 (ATCC 37040) is derived from
M13 bla 61 through the additional insertion of the
chloramphenicol resistance gene (HINE80); M13 bla cat 1
contains 9.88 kb of DNA. Although neither of these
variants of M13 contains the ColE1 origin of
replication, either could be used as a starting point
to construct a usable cloning vector for the present
example.
The OCV for the current example is constructed by
a process illustrated in Figure 4. A list
of all the plasmids and phagemids constructed for this
Example is found in Table 17.

' 1 • ^ite-directed mutagenesis, 

vor ss olxgo-nt site uxi-c-- 

— :r ::r :::: 
rer;;.rrrorrr; — . .... 

0 3431-3451 Of PBR322. Not. that pI/=2 and 

derivatives carry the anti-sense strand of the ^ 
™nl m the + DTO strand. The segments are pio^d to 
"hi : Tn .c content and to divide the p=.7 — 
into several segments of approximately equal length. 

The genetic engineering procedures used to
construct the OCV are standard, using commercially
available restriction enzymes under recommended
conditions.

.0 purified hy ^^^'^'^'^^^^J^Z/^JTr^t. ^ ^ -rain 
engineered derivatives "^^^'^^^^^^^.^ " „^ „13 
PE384 (F+,Rec-,Sup+,AmpS). Plasmid DNA of M13
derivatives is transformed into E. coli strain PE383
(F-,Rec-,Sup+,AmpS) so that we avoid multiple rounds of
infection in the culture. Isolation of M13 phage is by
the procedure of Salivar et al. (SALI64); isolation of
replicative form (RF) M13 is by the procedure of
Jazwinski et al. (JAZW73a and JAZW73b). Isolation of
plasmids containing the ColE1 origin of replication is
by the method of Maniatis (MANI82).

We pick the amp gene from pBR322 as a convenient
antibiotic resistance gene. Another resistance gene,
such as kanamycin, could be used. The Acc I-to-Aat II
fragment of pBR322 is a conveniently obtained source of
amp and the ColE1 origin.

M13mp18 (New England BioLabs) contains neither Aat
II nor Acc I sites. Therefore we insert an adaptor
that allows us to insert the Aat II-to-Acc I fragment
of pBR322 that carries the amp gene and the ColE1
origin of replication into a desirable place in
M13mp18. M13mp18 contains a lac promoter and a lacZ
gene that are not useful to the purposes of the present
invention. By cutting M13mp18 with AvaII and Bsu36I
and discarding the approximately 600 intervening base
pairs, we eliminate all recognition sites of several
enzymes useful for engineering the bpti-gene VIII gene.
The following adaptor is synthesized,

The annealed adaptor is ligated with RF M13mp18
that has been cut with both AvaII and Bsu36I and
purified by PAGE or HPLC. Transformed cells are
selected for plasmid uptake with ampicillin. The
resulting construct is called pLG1.

DNA from pLG1 is cut with both Aat II and Acc I.
The AatII-to-AccI fragment of pBR322 is ligated to the
backbone of pLG1. The correct construct is named pLG2.

The Acc I restriction site is no longer needed for
vector construction. To eliminate this site, RF pLG2
dsDNA is cut with Acc I, treated with Klenow fragment
and dATP and dTTP to make it blunt and then religated.
The cloning vector, named pLG3, is now ready for
stepwise insertion of the osp-ipbd gene.
We are now ready to design a gene
that will cause BPTI domains to appear on the outer
surface of an M13 derivative: LG7.

To obtain a novel protein domain attached to the
outside of M13, we insert DNA that codes for mature
BPTI after A23 of the precoat protein of M13. Mature
Bm heX with an arginine residue, which is charge = 
BPTI oegi a-se i is normal in suoh oases. 

10 cleavage by signal peptidase I 

Signal peptidase I (SP-I) cuts a chimera of M13 precoat
protein and BPTI after A23, leaving BPTI attached
at its carboxy end to the amino terminus of M13 CP.

The following amino-acid sequence, called AA_seq2,
is constructed by inserting the sequence for mature
BPTI (shown underscored) immediately after the signal
sequence of M13 precoat protein (indicated by the
arrow) and before the sequence for the M13 CP.

20 AA_seq2 

2 3 ^ ^ ^ 

3, ^^colmiRAii^^ 



10 11 11 12 12 13 

35 eyigyawam?wivLtigiklfkkftskas 
Sequence numbers of fusion proteins refer to the
fusion, as coded, unless otherwise noted. Thus the
alanine that begins M13 CP is referred to as "number
82", "number 1 of M13 CP", or "number 59 of the mature
BPTI-M13 CP fusion".

The osp-ipbd gene is regulated by the lacUV5
promoter and terminated by the trpA transcription
terminator. The host strain of E. coli harbors the
lacIq gene. The osp-ipbd gene is expressed and
processed in parallel with the wild-type gene VIII.
The novel protein, which consists of BPTI tethered to an
M13 CP domain, constitutes only a fraction of the coat.
Affinity separation is able to separate phage carrying
only five or six copies of a molecule that has high
affinity for an affinity matrix (SMIT85); 1%
incorporation of the chimeric protein results in about
30 copies of the protein exposed on the surface. If
this is insufficient, additional copies may be provided
by, for example, increasing IPTG.
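
The copy-number estimate above is a simple proportion, and a short calculation makes its dependence on the degree of incorporation explicit. The following Python sketch is illustrative only: the function names are ours, and the coat-protein copy number of 3000 is an assumed round figure (the text states only "more than 2500 copies per virion") chosen so that 1% incorporation gives the "about 30 copies" quoted above.

```python
# Assumed number of major coat protein copies per virion (hypothetical round
# figure; the specification says only "more than 2500").
COAT_COPIES = 3000


def chimeric_copies(coat_copies: int, fraction: float) -> float:
    """Expected number of chimeric coat proteins displayed per virion."""
    return coat_copies * fraction


def min_fraction(coat_copies: int, needed_copies: int) -> float:
    """Smallest incorporation fraction that yields `needed_copies` per virion."""
    return needed_copies / coat_copies


if __name__ == "__main__":
    # 1% incorporation, as in the text.
    print(chimeric_copies(COAT_COPIES, 0.01))
    # Fraction needed to reach the five copies that affinity separation can use.
    print(min_fraction(COAT_COPIES, 5))
```

Under this assumption, the five or six copies that affinity separation can exploit correspond to an incorporation of well under 1%.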

A model comprising M13 coat, after the model for
f1 of Marvin and colleagues (BANN81), and a BPTI
domain, taken from the Brookhaven Protein Data Bank
entry "6PTI", was constructed by standard model
building methods that insure that covalent bond lengths
and angles are close to acceptable values. The model
shows that the fusion protein could fit into the
supramolecular structure in a stereochemically
acceptable fashion without disturbing the internal
structure of either the M13 CP or BPTI domain.
The ambiguous DNA sequence coding for AA_seq2 is
examined by a computer program for places where
recognition sites for restriction enzymes could be
created without altering the amino-acid sequence. (See
Sec. 4.3). A master table of enzymes is compiled from
the catalogues of enzyme suppliers. The enzymes that
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not cut the OCV. (Preferably constructea « 

described above) . 

osin, the procedure given in Sec. 4 3, 

^ ,e„e, such a. j"^-^^; . " the TcV 

restriction enzymes (e^ Ban I or HEh D 
too often to be of value. 
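
The search for silent restriction sites described above can be sketched as follows. This is a minimal illustration, not the program of Sec. 4.3: it assumes the standard genetic code, handles only recognition sites written with the four unambiguous bases, and the function names are hypothetical. For every nucleotide offset it asks whether each codon overlapped by the site can be chosen, among the synonymous codons of the encoded amino acid, so that it agrees with the site.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")

# Map each amino acid (or '*' for stop) to its synonymous codons.
CODONS_FOR = {}
for i, first in enumerate(BASES):
    for j, second in enumerate(BASES):
        for k, third in enumerate(BASES):
            aa = AMINO[16 * i + 4 * j + k]
            CODONS_FOR.setdefault(aa, []).append(first + second + third)


def _codon_fits(aa, codon_index, required):
    """True if some codon for `aa` matches every base the site fixes."""
    for codon in CODONS_FOR[aa]:
        if all(required.get(3 * codon_index + i, codon[i]) == codon[i]
               for i in range(3)):
            return True
    return False


def silent_site_positions(protein, site):
    """Nucleotide offsets (0-based) at which `site` can be encoded
    without changing the amino-acid sequence `protein`."""
    hits = []
    total = 3 * len(protein)
    for start in range(total - len(site) + 1):
        # Bases that the recognition site would fix, keyed by position.
        required = {start + i: base for i, base in enumerate(site)}
        first_codon = start // 3
        last_codon = (start + len(site) - 1) // 3
        if all(_codon_fits(protein[c], c, required)
               for c in range(first_codon, last_codon + 1)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Gly-Thr can be written GGT ACC, which contains GGTACC (Kpn I).
    print(silent_site_positions("GT", "GGTACC"))
```

A fuller program would also take the master table of enzymes as input and discard any enzyme that cuts the OCV elsewhere, as the text requires.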

.he entire D«. se^ence =£ the .13^ 
with annotation appears in Table ^^^'^^^^'^^^f^^^'^J' 
restriction sites and biologically important features,
viz. the lacUV5 promoter, the lac operator, the
Shine-Dalgarno sequence, the amino acid sequence, the stop
codons, and the transcriptional terminator.

The ipbd gene is synthesized in several steps
using the method described in Sec. 5.1, generating
dsDNA fragments of 150 to 190 base pairs.

The four steps (See Sec. 6.1) by which
synthetic fragments of the osp-ipbd gene (the
bpti-gene VIII gene of the present example) are
inserted into pLG3 and its derivatives are
illustrated in Figure 5.

The sequence to be introduced into pLG3 comprises
a) the segment from RsrII to AvrII (Table 25), b) a
spacer sequence (gccgctcc), and c) the segment from
AsuII to SauI. This sequence is
synthesized from two shorter synthetic oligo-nts as
described in Sec. 5.1 of the generic specification.

Table 27 shows the antisense strand of the
sequence to be inserted. The 99 base fragment shown in
upper case letters (olig#3) is synthesized in the
standard manner. Similarly, the 100 base long fragment
of the sense strand shown in lower case is synthesized.
After annealing, the double-stranded region is extended
with Klenow fragment by the procedure given above to make
the entire 176 bases double stranded. The overlap
region is 23 base pairs long and contains 14 CG pairs
and 9 AT pairs. The DNA between AvrII and AsuII does
not code for anything in the final gene; it is
there so that the DNA can be cut by AvrII and
AsuII at the same time in the next step. Eight bases
have been added to the left of RsrII and nine bases
have been added to the left of SauI (same specificity
and cutting pattern as Bsu36I). These bases at the
ends are not part of the final product; they are
present so that the restriction enzymes can bind and
cut the synthetic DNA to produce specific sticky ends.
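
The fragment sizes quoted above are mutually consistent, which can be checked with a short calculation. The Python sketch below is illustrative and its function names are ours; it assumes only that two oligo-nts annealed through a terminal overlap and filled in with Klenow fragment give a duplex whose length is the sum of the two lengths less the overlap.

```python
def filled_in_length(len_a: int, len_b: int, overlap: int) -> int:
    """Length of the duplex obtained by annealing two oligo-nts through a
    terminal overlap and extending both 3' ends with Klenow fragment."""
    if overlap > min(len_a, len_b):
        raise ValueError("overlap cannot exceed the shorter oligo-nt")
    return len_a + len_b - overlap


def count_pairs(overlap_seq: str) -> tuple:
    """Return (CG pairs, AT pairs) for one strand of an overlap region."""
    seq = overlap_seq.upper()
    cg = sum(seq.count(base) for base in "CG")
    at = sum(seq.count(base) for base in "AT")
    return cg, at


if __name__ == "__main__":
    # 99-base and 100-base oligo-nts with a 23 bp overlap, as in the text.
    print(filled_in_length(99, 100, 23))
    # A 23 bp overlap with 14 CG pairs leaves 23 - 14 AT pairs.
    print(23 - 14)
```

The 176-base duplex and the split of the 23 bp overlap into 14 CG and 9 AT pairs both follow from the stated oligo-nt lengths.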

The synthetic DNA is cut with both SauI and RsrII
and is ligated to similarly cut dsDNA of pLG3. The
construct with the correct insert is called pLG4.

The second step of the construction of the OCV is 
illustrated in Table as. As in the constru^ion of 
pI,G4, two pieces of single-stranded DHA 
sjnthesi-*: a .3 base long fragment of the anti-sen^e 
s«and ending with p25 and a 99 base long fragm^t 
(Starting with plS, . Both the synthetic dsDKA and dsj 
U D«A are cut with both t^ll and A^II and are 
Ugated and used to transform S. — 
carrying this second insert is called pI/35. 

construction of pifie proceeds similarly to the 
construction of pl^5. The sequence is shown m Table 
35 30 The two single stranded segments (one from the 
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the sense strand starting w.th the th-d ^^^^^ 

.odon .or «S, are ;Xrt;c and 

with Klenow fragment. Both the y 

- cut „i«> - L used to 

the appropriate pieces 

transform 1^. soli* 

4^ r^rri is illustrated in Table 
construction of ^^[^ pi.,, 

3. and proceeds " ^ ^ ^ ,one 

pLG5, and pLG6. Tne t:wo y ^^^^^ 

the ----^-td^irot::: innin, «ith 

o£ the codon for Vlio and tn ^^^^^^d with 

- trr/nttirU and ^..e 

rit'thn-oth^X and U. purified, and the 

J' Pieces are ligated and used to transform 
appropriate pieces are g ^,,„„t fourth insert is 
^. .he construct „„,er surface 

called p037 ; the display of BPTI on 
Of is verified hy the methods of Sec. 8. 

• .„ amber mutation of M13 used to 
phages derived from wu. 

standard genetic methods from wtM13. 

™ T- coli strain PE384 in LB 
Phage I«7 is grown on Mli 
hroth with various concentrations of IPTG ad 
.edium to induce the ..p^ ^^^^^ 
Obtained from cells gro«, with "j^;^^;"; 

-1 n TnM TPTG. harvested (See bec» i-'jr 
"thL c^TaAt <sC4,, and concentrated to obtain 
a titre of 10^12 pfu/ml by the method of Messing
(MESS83).
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preferred »ethoa o. aeter»lnln, »heth«: 1^7 
displays BPTI on i« .-face <See Sec^ ») - 

derivative of trypsin ^uj-p; 

. filter that allows passage of m*ound tr^ or 
5 on a filter una tvroslne residues and can 

Trypsin contains 10 tyrosine 

with 125i by standard methods, we denote 
be -ol^^^f J „,^„,„. i^eled anhydrotrypsln 

the labeled trypsin as trp . 

. . "SHTro*". Other types of laBeis ca 

.e denoted , tluorescent 

10 used on trp or ^^. J^ ^^^^^^^^ „,3 

Xabel. MTrp. or t^* s l^bel ^.^^ 

:r::-ofVrpT ^ l!o .1 of . suffer Of 10^ 

^;:;!dtt". to PH s.o With 1 ^ 

ml^ure is P--^^.~ ;\t:rprsa,f Of proteins 
with a membrane filter that allows p y 

ller that = 300,000. Filters are soaked m 

smaller that ' analysis, 

buffer containing trp or „ 5 ml of buffer 

The filter is washed twice with 0.5 ml 



,,,.er is wa^ea .wi.. ^^^^^^^ 
r fU^r ^anTSed with a scintillation counter 

r o^r suitable device. If ^ 

then .05 ug of protein can oe 
rid lirrisrto . . lO^ disintegrations / minute on 



25 the filter. 
An alternative way to quantitate display of BPTI
on the surface of LG7 is to use the stoichiometric
binding of trypsin and BPTI to titrate the BPTI. A
solution that titers 10^12 pfu/ml of a phage is
approximately 1.6 x 10^-9 M in phage if all phage are
infective. The ratio of pfu to total phage can be
determined spectrophotometrically using the molar 
ruction coefficients at - ^ ™ 

for the increased length of LG7 as compar 
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, If a 1 0 ml solution that contains 10^2 
For example, if a i. , n itiM IPTG inhibits 

1.7 pha,e 7" ;f„.V;,C calculate that 
trypsin solutions up to ^-^^ (J^ (4.8 x 10"^ 

molecules of BPTI/D/d-* ^ easily 

a speoiUoa peptiae-lin^ea 

dye, such as Nglpha oen^oy 

=r. affinity column may 
Alternatively, binding to an affinity column may
be used to demonstrate the presence of BPTI on the
surface of phage LG7. An affinity column of 2.0 ml
total volume having BioRad Affi-Gel 10 (TM) matrix and
30 mg of AHTrp as affinity material is prepared by the
method of BioRad. The void volume (Vv) of this column
is, by hypothesis, 1.0 ml. This affinity column is
denoted {AHTrp}.

. eample Of 10^^ M13^- - 

- - Of 10 ^ the same .uffer 

K2HPO4. the column xs then ^^^^^^^^ 
the optical densxty at 2S0 ^^^^^^^ 
returns to base line or 4 x V, 

t,e Tt e'locHed .«..rp, column at 

LGIO are then appiiea VM,ffer The column 

P-/- - - - r^thTsar b:;.er until the 
i3 then washed again wiU. the^^^ ^^^^^^^^ ^^^^^^^ 

optical density at 230 ^^^^^^^^ 

base line or 4 x Vv have been p 

comes first ^v,, buffered to pH S.O with 

..om 00™. The first KCl 
phosphate IS passed ov ^^^^ . ^^^^ ^ ^ 

gradient is followed by a KCl gra followed 

. o ^ -ST The second KCl gradien-c 

to 5 M m 3 X Vy. The s ^ 

35 by a gradient of guanidmium Cl from 
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2 X Vv in 5 M KCl and buffered to pH 8.0 with 
phosphate. Fractions of 50 ul are collected and
assayed for phage by plating 4 ul of each fraction at
suitable dilutions on sensitive cells. Retention of
phage on the column is indicated by appearance of LG7
phage in fractions that elute significantly later from
the column than control phage LG10 or wtM13. A
successful isolate of LG7 that displays BPTI is
identified, the bpti insert and junctions are
sequenced, and this isolate is used for further work
described below.

If vgDNA is used to obtain a functional fusion
between a BPTI mutant and M13 CP (vide infra), then DNA
from a clonal isolate is sequenced in the regions that
were variegated. Then gratuitous restriction sites for
useful restriction enzymes are removed if possible by
silent codon changes. The sequence numbers of residues
in OSP-IPBD will be changed by any insertions;
hereinafter, we will, however, denote residues inserted
after residue 23 as 23a, 23b, etc. Insertions after
residue 81 will be denoted as 81a, 81b, etc. This
preserves the numbering of residues between C5 and C55
of BPTI. Residue C5 of BPTI is always denoted as 28 in
the fusion; residue C55 of BPTI is always denoted as 78
in the fusion, and the intervening residues have
constant numbers.
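
The numbering convention above can be stated compactly in code. The Python sketch below is illustrative and its function names are ours; the signal-sequence length of 23 and the BPTI length of 58 are taken from the numbering used in this example (C5 of BPTI is residue 28 of the fusion, and the first residue of M13 CP is residue 82).

```python
SIGNAL_LENGTH = 23  # residues removed by signal peptidase I
BPTI_LENGTH = 58    # residues of mature BPTI


def fusion_number_from_bpti(bpti_residue: int) -> int:
    """Fusion (as coded) number of a residue of mature BPTI."""
    if not 1 <= bpti_residue <= BPTI_LENGTH:
        raise ValueError("BPTI residues are numbered 1 to 58")
    return SIGNAL_LENGTH + bpti_residue


def fusion_number_from_cp(cp_residue: int) -> int:
    """Fusion (as coded) number of a residue of mature M13 CP."""
    return SIGNAL_LENGTH + BPTI_LENGTH + cp_residue


def mature_number(fusion_number: int) -> int:
    """Number of a residue in the mature (processed) fusion protein."""
    return fusion_number - SIGNAL_LENGTH


def insertion_labels(after: int, count: int) -> list:
    """Labels for residues inserted after a given residue: 23a, 23b, ..."""
    return [f"{after}{chr(ord('a') + i)}" for i in range(count)]


if __name__ == "__main__":
    print(fusion_number_from_bpti(5), fusion_number_from_bpti(55))
    print(fusion_number_from_cp(1), mature_number(82))
    print(insertion_labels(23, 2))
```

Because inserted residues receive letter suffixes rather than new integers, the numbers of all residues in the original fusion stay fixed however many residues are added at either junction.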

Should LG7 phage from cells grown with 10 mM IPTG
fail to display BPTI on its surface, we have several
options. We might try to determine why the
construction failed to work as expected. There are
various possible modes of failure, including: a) BPTI
is not cleaved from the M13 signal sequence, b) BPTI is
cleaved from the M13 CP, and c) the chimeric protein is
made and cleaved after the signal sequence, but the
processed protein is not incorporated into the M13
coat. BPTI has been secreted from E. coli (MARK86).
However, the M13 coat-protein signal sequence was not
used. Therefore problems stemming from the signal
sequence are unlikely, but possible. We could
determine whether BPTI was present in the periplasm or
bound to the inner membrane of LG7-infected cells by
assays using trp* or AHTrp*.

Proteins in the periplasm can be freed through
spheroplast formation using lysozyme and EDTA in a
concentrated sucrose solution (BIRD67, MALA64). If BPTI
were free in the periplasm, it would be found in the
supernatant. Trp* would be mixed with supernatant
and passed over a non-denaturing molecular sizing
column and the radioactive fractions collected. The
radioactive fractions would then be analyzed by
SDS-PAGE and examined for BPTI-sized bands by silver
staining.

Spheroplast formation exposes proteins anchored in
the inner membrane. Spheroplasts are mixed with AHTrp*
and then either filtered or centrifuged to separate
J »urnv.«* After washing witn 
them from unbound AHTrp*. 

rrtonic buffer, the f^'^^^^^.:::.^"'^^^^ 
extent of AHTrp* binding alternatively, 
proteins are analyzed by western blot analysis. 

If BPTI is found free in the periplasm, then we
would expect that the chimeric protein was being
cleaved both between BPTI and the M13 mature coat
sequence and between BPTI and the signal sequence. In
that case, we should alter the BPTI/M13 CP junction by
inserting vgDNA at codons for residues 78-82 of
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AA_seq2 . 

If BPTI is found attached to the inner membrane,
then there are two likely explanations. The first is
S ^t t" Chimeric protein is .eing cut after the sxgn^ 
5 that tn incorporated xnto 

sequence, but .s \ i^^e^ vgDNA 

Virion; ..e alternative 

.etween --^-^J^^^^j; "Z,^- J,, and react with 
hypothesis IS that BPTi couj. 

.0 :Sp3in even 1. .i^- ^^^^CrLain, 
terminal amino acid sequencing of tryps 
^I^rlal isolated from cell homogenate determines what 
material isoia^: sequence were being 

processing is occurring. If signal q 
Leaved we would use the procedure above to vary 
cleaved, svibsequent passes would 

IK -r^^sidues between C78 ana ao^, 

aHsiaues a«er residue SI. If signal -"^^^ - 

and 27 of M._seq2- Subsequent passes tnro g 
process would add residues after 23. 



If BPTI were found neither in the periplasm nor on
the inner membrane, then we would expect that the fault
was in the signal sequence or the signal-sequence-to-
BPTI junction. The treatment in this case would be to
vary residues between 23 and 27.

Several experiments that introduce variegation
into the bpti-gene VIII fusion are possible, including:

1) 3 variegated codons between residues 78 and 82
using olig#12 and olig#13,

2) 3 variegated codons between residues 23 and 27
using olig#14 and olig#15,

3) 5 variegated codons between residues 78 and 82
using olig#13 and olig#12a,

4) 5 variegated codons between residues 23 and 27
using olig#15 and olig#14a,

5) 7 variegated codons between residues 78 and 82
using olig#13 and olig#12b, and

6) 7 variegated codons between residues 23 and 27
using olig#15 and olig#14b.
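
The size of the library implied by each of these experiments can be estimated directly. The Python sketch below is illustrative and its names are ours; it assumes that each variegated codon has the form used in the oligo-nts of this example, with four possible bases at each of the first two positions and two (T or G) at the third, giving 32 codons per variegated position.

```python
# Four bases at positions one and two, two bases (T or G) at position three.
CODONS_PER_POSITION = 4 * 4 * 2


def dna_variants(n_codons: int) -> int:
    """Number of distinct DNA sequences for n variegated codons."""
    return CODONS_PER_POSITION ** n_codons


def max_protein_variants(n_codons: int) -> int:
    """Upper bound on distinct amino-acid sequences (20 per position)."""
    return 20 ** n_codons


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, dna_variants(n), max_protein_variants(n))
```

The count grows by a factor of about a thousand for each two codons added, so the experiments with 7 variegated codons sample a far larger sequence space than those with 3.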

To alter the BPTI-M13 CP junction, we introduce
DNA variegated at codons for residues between 78 and 82
into the Sph I and Sfi I sites of pLG7. The residues
after the last cysteine are highly variable in amino
acid sequences homologous to BPTI, both in composition
and length. In Table 25 these residues are denoted as
G79, G80, and A81. The first three residues of M13 CP
are denoted as A82, E83, and G84. One of the oligo-nts
olig#12, olig#12a, or olig#12b and the primer olig#13
are synthesized by standard methods. The oligo-nts
are:

..gcrgir=-i-^i-^i-=i^=i-''^i'*'''^''^''^'' 

cSi|cll|i5lcrc|cl'c|^l/c^|cl=l,cg|cc 3. oUgti2 
s.gciragl^lc|xJllcl?|.rc|Tlc|,«|gSlglil"^^l"'^|- 

82 83 84 85 86 87 
GCTlGAA|GGT|GATlGATlCCGl- 

acclililGCGlGCClgcgicc 3- olig#i2a 
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•j,,^ IK Td 77 78 79 80 81 81a 81b 
gc 1 Sg I^So I aJI I CGT I 1 TGC I ,fK I 1 qf* 1 gfk I qflC 1 - 

81c 81d 82 83 84 85 86 87 
qfk 1 qfk I GCT 1 GAA I GGT 1 GAT I GAT 1 CCG 1 - 

88 89 90 91 
GCClAAA|GCGlGCClgcg|cc 3' olig#12b 



5T^S?c1g?c|c^?|t??|gIc1cIg1a?c 3. olig#13 



where q is a mixture of (0.26 T, 0.18 C, 0.26 A, and
0.30 G), f is a mixture of (0.22 T, 0.16 C, 0.40 A, and
0.22 G), and k is a mixture of equal parts of T and G.
The bases shown in lower case at either end are spacers
and are not incorporated into the cloned gene. The
primer is complementary to the 3' end of each of the
longer oligo-nts. One of the variegated oligo-nts and
the primer olig#13 are combined in equimolar amounts
and annealed. The dsDNA is completed with all four
(nt)TPs and Klenow fragment. The resulting dsDNA and
RF pLG7 are cut with both Sfi I and Sph I, purified,
mixed, and ligated. This ligation mixture goes through
the process described in Sec. 15 in which we select a
transformed clone that, when induced with IPTG, binds
AHTrp.
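
The amino-acid distribution implied by the q, f, and k mixtures defined above can be computed directly. The Python sketch below is illustrative and its names are ours; it assumes the standard genetic code and that the three positions of a variegated codon are filled independently with the stated base frequencies.

```python
from collections import defaultdict

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")

# Base mixtures as defined in the text.
Q = {"T": 0.26, "C": 0.18, "A": 0.26, "G": 0.30}
F = {"T": 0.22, "C": 0.16, "A": 0.40, "G": 0.22}
K = {"T": 0.50, "G": 0.50}


def translate(codon: str) -> str:
    """Amino acid (one-letter code, '*' for stop) encoded by a codon."""
    i, j, k = (BASES.index(base) for base in codon)
    return AMINO[16 * i + 4 * j + k]


def qfk_distribution() -> dict:
    """Probability of each amino acid (and stop) at one qfk codon."""
    dist = defaultdict(float)
    for b1, p1 in Q.items():
        for b2, p2 in F.items():
            for b3, p3 in K.items():
                dist[translate(b1 + b2 + b3)] += p1 * p2 * p3
    return dict(dist)


if __name__ == "__main__":
    for aa, p in sorted(qfk_distribution().items(), key=lambda item: -item[1]):
        print(aa, round(p, 4))
```

Restricting the third base to T or G leaves TAG as the only possible stop codon while still allowing all twenty amino acids.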






To vary the junction between M13 signal sequence
and BPTI, we introduce DNA variegated at codons for
residues between 23 and 27 into the Kpn I and Xho I
sites of pLG7. The first three residues are highly
variable in amino acid sequences homologous to BPTI.
Homologous sequences also vary in length at the amino
terminus. One of the oligo-nts olig#14, olig#14a, or
olig#14b and the primer olig#15 are synthesized by
standard methods. The oligo-nts are:

1-7 18 19 20 21 22 23 24 25 
residue : _ , _13 , , , Jl^, , rnr^ i ttt I GCT I of k qf k 



5^'1|gccigcGlGiA!c;llAiG|:CTGlTCTlTTTlGCT|qfk|qfkl- 
lqfklT?ClTCTlCT?|GiGlcgc|ccg|cgal 3' olig#14 

5.gi&G|GS|ciGlAiGlCTG|Tci|T?T|GC?^ 
,|?k|qSlTT'clTG?|c?clGAGlcgclccg|cgal 3' olig#14a. 
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5.glgcc|gcG|GTllccGlAiG|c^^ 

5' ItcglcgglgcglCTClGAGlACAlGAAl 3- olig#l5 

where q is a mixture of (0.26 T, 0.18 C, 0.26 A, and
0.30 G), f is a mixture of (0.22 T, 0.16 C, 0.40 A, and
0.22 G), and k is a mixture of equal parts of T and G.
The bases shown in lower case at either end are
spacers. One of the variegated oligo-nts and the
primer are combined in equimolar amounts and annealed.
The dsDNA is completed with all four (nt)TPs and
Klenow fragment. The resulting dsDNA and RF pLG7 are
cut with both Kpn I and Xho I, purified, mixed, and
ligated. This ligation mixture goes through the
process described in Sec. 15 in which we select a
transformed clone that, when induced with IPTG, binds
AHTrp or trp.
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If none of these approaches produces a working 
chimeric protein, we may try a different signal
sequence, or a different OSP in M13 (e.g. the gene III
protein for which there is fusion data (SMIT85,
CRUZ88)), or another genetic package.

Yy^^T^i^ 1 ^ Part II 

BPTI binds very tightly to trypsin
(Kd = 6.0 x 10^-14 M) and to anhydrotrypsin, so that
these molecules are not preferred for optimizing the
amount of BPTI to display on LG7 or the amount of
affinity molecule to attach to the column. Tschesche
et al. reported on the binding of several BPTI
derivatives to various proteases:
Dissociation constants for BPTI derivatives. Molar, 
pancreas) pancreas) 



25 



lysine 6.0 x lO"!* 9.0 x lO'^ 

glycine 

alanine + 

valine 

leucine " 



pancreas) leukocytes) 
3.5 X 10-S 
-9 



+ 7.0 X 10" 

2.8 X 10-8 2.5 X 10-9 
5.7 X 10-8 1.1 xlO-10 

1.9 X 10-8 2.9 X 10-9 



30 



From the report of Tschesche et al. we infer that
molecular pairs marked "+" have Kds greater than
3.5 x 10^-6 M and that molecular pairs marked "-" have
Kds much greater than 3.5 x 10^-6 M. Because of the
wealth of data about the binding of BPTI and various
mutants to trypsin and other proteases (TSCH87), we can
proceed in various ways. (For other PBDs we can obtain
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two different monoclon.! antibodies, one with a high 
two aiJ-i. „3^th a 

affinity having Kd of order 10 h, .^-e „ ^ 

.Odette affihitv ha^. ^ ^ ^^a— ^ 
In this example, we may use. a) ^n ,„,ti.ll b) 

^ iBjikocvte elastase (HuLEl) , "i 

5 between BPTI and human leukocyte .i.^tase to 

the moderately strong binding of porcine elastase to
BPTI(V15), or c) the binding of BPTI(A15) (residue 38
of the osp-ipbd gene) for trypsin (weak but detectable) or
for porcine pancreatic elastase.
„e compare the retention of 1^7 virions to the 

retention of wild-type H13 on ,;^P.. M13 ^-"^"^^ 
retention „iid-type M13 have corresponding 

:::::: ::::onr r: .e win create px.s that differs 

PI., only in having stop --/^/te"^^ 
3 and an altered I, codon at codon 7 of the 0«v P - 
The phage LG8 will have exactly as much DNA as LG7;
therefore the LG8 virion is exactly as long as the LG7
virion. LG8 can not, however, display BPTI on its
surface.

TO expedite identification of different M13- 
derived phage, wa replace the ma gene of 
tetR gene from pBR322 by standard methods. ^he BS^ 
.3 -ring fragment of P^^ xs -gated 

into ON. fro. pI^S cut with ^^ ^'^^^^'J^ 
correct construction, havxng ^'^ 
distinguished froia pBR322 and is called LGIO. 

The phage LG7 is grown at various levels of IPTG
in the medium and harvested in the way previously
described. An affinity column having bed volume of 2.0
Tand supporting an amount of Hul.1 ^^^^J^^ 

-^n 0 ma on 1 ml of BioRaa ax^-l 
3, r:o<™> rffi^l r^.™. is designated .Hu1.1,. 
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. ^ ™rlat. =et =t densities of HuX^l on the -1^ 

is (0.1 mg/ml, 0.5 mg/ml, 2.0 mg/ml, 8.0 mg/ml, 15.0
mg/ml, and 30.0 mg/ml). The Vv of {HuLE1} is, by
hypothesis, 1.0 ml. The elution of LG7 phage is
compared to the elution of LG10 on {HuLE1} having
varying amounts of HuLE1 affixed. The columns are
eluted in a standard way:

1) 10 mM KCl buffered to pH 8.0 with phosphate,
until optical density at 280 nm falls to base line
or 4 x Vv, whichever is first,

2) a gradient of 10 mM to 2 M KCl in 3 x Vv, pH
held at 8.0 with phosphate,

3) a gradient of 2 M to 5 M KCl in 3 x Vv,
phosphate buffer to pH 8.0,

4) constant 5 M KCl plus 0 to 0.8 M guanidinium Cl
in 2 x Vv, with phosphate buffer to pH 8.0.
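
The gradient steps 2) through 4) above define the eluent composition as a function of the volume passed through the column. The Python sketch below is illustrative; the function name is ours, each gradient is assumed to be linear, and the volume is measured in units of Vv from the start of step 2) (that is, after the initial 10 mM KCl wash).

```python
def eluent_at(v: float) -> tuple:
    """Return (KCl, guanidinium Cl) in mol/L after v void volumes of the
    gradient program, counted from the start of step 2)."""
    if not 0.0 <= v <= 8.0:
        raise ValueError("the gradient program spans 0 to 8 void volumes")
    if v < 3.0:
        # Step 2: 10 mM to 2 M KCl over 3 x Vv.
        return 0.010 + (2.0 - 0.010) * v / 3.0, 0.0
    if v < 6.0:
        # Step 3: 2 M to 5 M KCl over 3 x Vv.
        return 2.0 + (5.0 - 2.0) * (v - 3.0) / 3.0, 0.0
    # Step 4: constant 5 M KCl, 0 to 0.8 M guanidinium Cl over 2 x Vv.
    return 5.0, 0.8 * (v - 6.0) / 2.0


if __name__ == "__main__":
    for v in (0.0, 1.5, 3.0, 4.5, 6.0, 7.0, 8.0):
        print(v, eluent_at(v))
```

Given the fraction volume and Vv, the position of an elution peak can then be reported as a salt concentration rather than a fraction number, which makes runs on columns of different size comparable.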

The preferred level of induction (IPTGoptimal) and
amount of affinity molecule on the matrix
(DoAMoMoptimal) are those settings that give the
sharpest LG7 elution peak that shows
retardation as compared to LG8, which carries no BPTI.
By hypothesis, the best separation occurs for the
amount of BPTI/GP produced when the cells are induced
with 10.0 uM IPTG and when 4.0 mg HuLE1/ml is applied
to BioRad Affi-Gel 10 (TM).

When the amount of BPTI/GP and the amount of
HuLE1/volume of support have been optimized, we turn to
optimization of elution rate, initial ionic strength,
and the amount of GP/(volume of support). These
parameters can be optimized separately.

Using optimal BPTI/GP and HuLE1/volume of support,
we measure the elution volume of LG7 and LG8 for
different elution rates, viz. 1, 1/2, 1/4, 1/8, and 1/16
times the maximum flow rate. By hypothesis, 1/4 of
maximum elution rate is better than 1/2, but 1/8 is
about the same as 1/4. Therefore 1/4 maximum elution
rate will be used.

Elution volumes of LG7 obtained from cells grown
on media that is 2.0 mM in IPTG are measured at optimal
DoAMoM and elution rate for loadings of 10^10,
10^11 and 10^12 pfu. By hypothesis, 10^12 pfu of pure
LG7 overloads the column and a significant number of
phage elute before their characteristic position in the
KCl gradient. We also find that 10^11 pfu overloads the
column only slightly, and that 10^10 pfu does not
overload the column. Because the use of the affinity
separation in Sec. 15 will involve a population in
which no single member is more than one part in
10^[illegible], we
conclude that 10^12 pfu of a variegated population could
be applied to a column of 1.0 ml matrix volume without
overloading with respect to any one species. The
overloading of a 1.0 ml column by 10^12 pfu also
indicates that the initial column that captures
indiscriminately adhesive phage should be 5 to 10 times
as large as the column that supports the target
material.

Elution volumes of LG7 and LG10 obtained from
cells grown on media that is 2.0 mM in IPTG are
measured at optimal conditions and for a loading of
10^10 pfu for various initial ionic strengths: 1.0 mM,
5.0 mM, 10.0 mM, 20.0 mM, and 50.0 mM. We may find,
for example, that LG10 is slightly retarded by the
column when loaded at 1.0 mM KCl, but that LG7 always
comes off the column at its characteristic place in the
gradient. We use 10.0 mM as initial ionic strength in
all remaining affinity separations.

To determine the sensitivity of chromatography of 
phage that display variants of BPTI on their surfaces 
(Sec. 10.1), we prepare artificial mixtures of two 
closely-related phage that differ only at one residue 
in the BPTI domain. One variety of phage has strong 
affinity for the column used in this step, while the 
other phage has no affinity for the column. We 
chromatograph these mixtures to discover how little of 
the phage that binds to the column can be detected 
within a large majority of phage that do not bind the 
column.

For these tests we choose AHTrp as AfM(BPTI). A
column having 2 ml bed volume is prepared with
(DoAMoMoptimal mg AHTrp)/(ml of Affi-Gel 10 (TM)).
The column is called {AHTrp} and has Vv = 1.0 ml.

A new phage, LG9, is prepared that displays
BPTI(V15) as IPBD, in contrast to LG7, which displays
BPTI(K15, wild-type) as IPBD. Residue 15 of BPTI is
residue 38 of the osp-ipbd gene. We introduce the
change K38 to V by replacement of a short segment of
the osp-ipbd gene between ApaI and StuI. The correct
construction is called pLG9. To expedite
differentiation between LG7 and an LG9 -derivative 
phage, we replace the gene of LG9 with the tet 

gene from pBR322. DNA from pBR322 between BsmI (1353, 
blunted) and AatH (1428) is ligated to dsDNA from pIX39 
cut with Xbal (blunted) and AafeU- The correct 
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. c 2 kb is easily distinguished 
construction, havxng 9.2 ^, ^^^^ ^^^^^ 

re^er^^ .ene to con.ir. the construction. 

. LGil are grown with optimuia IPTG (2.0 m) 
LG7 and 1X311 are gr . ^ ^he ratios 

and harvested. Mixtures are prepar 

]jG7:LG11 :: I'-^lim. 

^.n™ lolO to 105 by factors of 10. 
Where Vu. -nges fro. ^^^^^^ ^^^^ ^ ^^.^ 

Large values of V^^ ar ^^^^^^ ^^^^^^ 

found that allows recovery of LG7, 

^ n« first blocked by treatment 
The coluian (iUITrp) is firs ^ 
,011 virions Of H13^-S m 100 ul ^^^^^^ 
suffered to pH 8.0 with phosphate, the - ^^^^ ^^^^ 
.i^ the same buffer whichever 

- ' ^ .^\"^%nr :f tt lfxtures of ..7 and U... 
comes first. One of ^^^^ ^^^^^^ ,3 

containing id^ pfu m ^ . ^ ^ standard 

applied to {AHTrp}. The column is eluted in a standard
way:

1) 10 mM KCl buffered to pH 8.0 with phosphate,
until optical density at 280 nm falls to base line
or 4 x Vv, whichever is first, (discard effluent),

2) a gradient of 10 mM to 2 M KCl in 3 x Vv, pH
held at 8.0 with phosphate, (30 x 100 ul
fractions),

3) a gradient of 2 M to 5 M KCl in 3 x Vv,
phosphate buffer to pH 8.0, (30 x 100 ul
fractions),

4) constant 5 M KCl plus 0 to 0.8 M guanidinium Cl
in 2 x Vv, with phosphate buffer to pH 8.0, (20 x
100 ul fractions),

5) constant 5 M KCl plus 0.8 M guanidinium Cl in
1.2 x Vv, with phosphate buffer to pH 8.0, (12 x
100 ul fractions).
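
The fraction counts quoted for each elution step follow from the step volume and the fraction size. The short sketch below checks that arithmetic, assuming Vv = 1.0 ml for {AHTrp} and 100 ul fractions; fraction_count is a hypothetical helper name.

```python
def fraction_count(volume_in_vv: float, vv_ml: float = 1.0,
                   fraction_ul: float = 100.0) -> int:
    """Number of fractions collected when `volume_in_vv` column volumes
    are eluted from a column whose Vv is `vv_ml` millilitres."""
    return round(volume_in_vv * vv_ml * 1000.0 / fraction_ul)


# Volume of each collected step, in units of Vv (step 1 is discarded).
STEP_VOLUMES = {2: 3.0, 3: 3.0, 4: 2.0, 5: 1.2}

counts = {step: fraction_count(v) for step, v in STEP_VOLUMES.items()}
total_fractions = sum(counts.values())

if __name__ == "__main__":
    for step, n in counts.items():
        print(f"step {step}: {n} fractions")
    print("total:", total_fractions)
```

With these assumptions a single separation yields 92 fractions to be sampled and plated.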

, „f 4 ul from «ch fraction are plated at 
samples of 4 ul from ,. g^^* cells (so 

suitable dilution on P^^^^-^^^^ I, column 
«... „3^4. w^^^^^^^^^ 

^ttxx .s transferred to ampicilUn- 

SUP+ cells. ^■^^■'"^J , tested for 

containing IB agar, and top <= 
display Of BPTI(K15) by use of trp. or MTrp.. 

- 4 0 X 10^ is the largest 
Bv hypothesis, "^Hm ~ ^' ^ n • = 

v,-.h T^7 can be recovered. Thus Csensi 
''T 10°= T^ee o^clet of chromatography are retired 
isolat'e ^. souths first approximation to c,„ is 
740 ( ' exp( logeC^-O ^ ' 

now determine the efficiency of the affinity 

tfoT.Sec 10.2). This is done by. a) preparing 
separation (See. 10 ^ ^^^^^ ^^^^ 

ti^ f" for one separation cycle, and c, 
the f o, ,,07 in the last phage- 

determinmg the fraction , colonies 

bearing fraction. «hen 0 1= l-B - ^0 ■ V, ,o% of the 
are BMI positive. «hen Q i= 1-5 " ' . 
colonies are BPIl positive. Thus we calculate C^ff 

.60 X 1.5 X 10^ - 9°°- 
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BPTI do^ms on J« 3„ th.t expression 

^aer control ot the^ P manipulated ^ [1^=1 • 
levels Of BFII-M13 CP can ^^^^^^^^^ 
This construct may be used ^^^^^ 

all based on BPTI. »n '-'i' 

riXfirr amount ^^--J— ^rS^^^^ 

- m,/,ml - ---'UrpU- tr;olumns in t.is 
target molecules will app ^ ^^^^^ 

, amount in the process ^^^^^f jr;argets and all 

— -els .a^- a -^^^^^^^ ^ „.3 
variegation^ Of Bm dx PJ^^^^ °P-^^^-^^°" 
based on LG7, but temperatures are used, 
needed if other values of pH or temp 

5 „av be substituted for 

otner ,^ gene ,,,, u.elibood 

the mX gene fragment - "^^^ J „ev LS7 

that PBD will appear on the surface 

derivative. 

'° lxas!Ele_i— 

-t-vtjical protein target; an 
^^trbe used ^ -tisfies all of the 
other protein could be used. ^^^^ ^^^^^^ 

25 criteria for a target: J attachment it 

applied to an affinity matrix 2, ^^^^^^^^ ^^^^ 
is not reactive, and 3) af ^.^^.^^ 
sufficient unaltered surface to allow sp 

by PBDs. 

30 ^. HHMb is known: 1) 

The essential p„ 4.4 and 

^ is stable at least up to 70 c 

,.3, HHMb is stable «P - . ,,.„o„, 5, 

the PI of HH»b IS 7 0, f ,„teolytic 
35 HHMb requires haem, 6) 
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activity. 

zt::zzz:t:,j ..axe „y=,io.in 

HH«b, 2) the 3D stru i3 generally 

toxicity . 

We set the specifications of an SBD as:

1) T = 25°C

2) pH = 8.0

3) Acceptable solutes:
A) for binding:
i) phosphate, as buffer, 0 to 20 mM, and
ii) KCl, 10 mM,
B) for column elution:
i) phosphate, as buffer, 0 to 30 mM,
ii) KCl, up to 5 M, and
iii) Guanidinium Cl, up to 0.8 M.

4) Acceptable Kd < 1.0 x 10^[illegible] M.

We choose LG7 as GP(IPBD).

Hesidues to be varied are picted. In part, through 

- use Of — ^ — onnrii:srdu= 

We-piTa set of 
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.esMues to vary includes: the 3D '\ 
solvent accessibility o. J,"^^:l, Jous 

co^ilatlcn of sequences °' „ 
to BPTI, and 4) knowledge o£ the structura 
different amino acid types. 

Tables 16 and 34 indicate which residues of BPTI: 
hars:^stantial surface ^^J^^^^^ 

:::s" f ei,ht to twenty residues that are e'cposed and 
:riahle a-^d such that all meters of one set can touch 

. .oiecuie Of the -r;rr:dur;nara:^^^^^ 

,.s a small aB.no c.d at = contact the target 
-y not he able ^^^^^^ ^ 

'"TtZJ r chlr^'d amino acid mi,ht affect 
Mn:inr"tro;t ma«n, direct contact. In such cases 
thf resile should be included in the interaction set. 
:r a notation that larger f ^^^e^ 
In a similar way, large amino acids near the geometric
center of the interaction set may prevent residues on
either side of the large central residue from making
simultaneous contact. If a small amino acid were,
however, substituted for the large amino acid, then the
surface would become flatter and residues on either
side could make simultaneous contact. Such a residue
should be included in the interaction set with a
notation that small amino acids may be useful.

,eble 35 was prepared from ^^-^""^ j;"/, 

and Shows the maximum span between c^eta f 
Tal type of side group. =beta i= 

rigidly attached to the protein main-chain; rotation
about the Calpha-Cbeta bond is the most important
degree of freedom for determining the location of the
side group.

Table 34 indicates five surfaces that meet the
given criteria. The first surface comprises the set of
residues that contacts trypsin in the complex of
trypsin with BPTI as reported in the Brookhaven Protein
Data Bank entry "1TPA". This set is indicated by the
number "1". The exposed surface of the residues in
this set (taken from Table 16) totals 1148 Å^2 and
approximates the area of contact between BPTI and
trypsin.

Other surfaces, numbered 2 to 5, were picked by
first picking one exposed, variable residue and then
picking neighboring residues until a surface was
defined. The choice of sets of residues shown in Table
34 is in no way exhaustive or unique; other sets of
variable, surface residues can be picked. Hereinafter
we refer to K15 as being at the top of the molecule,
while the carboxy and amino termini are at the bottom.

Solvent accessibilities are useful, easily
tabulated indicators of a residue's exposure. Solvent 
accessibilities must be used with some caution; small 
amino acids are under-represented and large amino acids 
over-represented. The user must consider what the 
solvent accessibility of a different amino acid would 
be when substituted into the structure of BPTI. 

To create specific binding between a derivative of
BPTI and HHMb, we will vary the residues in set #2.
This set includes the twelve principal residues 17(R),
19(I), 21(Y), 27(A), 28(G), 29(L), 31(Q), 32(T), 34(V),
48(A), 49(E), and 52(M) (Sec. 13.1.1). None of the

the. -"^^ ^^^'^'''J.^^.ent substitution at each 
underlying structure, m p 

:::: r^ci proaL a^^ateiv 

rrr'^liraci: se^e„ces ana the sa.e nu^er o. 

surf aces ♦ 

3^. is a very .asic protein J^is propert. has 

,een used in isolating and '^^'^^ J^^^,^, ^ 

-i-hat the high frequency of argxnx^i 
hoBClogues so '"^-^ J*; / ^,,3 i„ isolation and is 
lysine residues may reflect hi^ „aeed. 
not necessarily required by the J ^^^^^^ 
SCI-III from IfiEtm 12Ei contains 
than basic groups {SiSMi) ■ 

residue 17 is highly variable and fully exposed 
can contain K, K, H, ^^J'/^^^M^, 

S. All types of a-ino „o acidic 

charged, neutral, and .f^^^ 3^,1,. 

groups are observed »ay be due to bias 

Kesidue 1. is also variable and fully exposed, 

. . T> T «? K. Qr and L. 

containing P, R/ ^' ^' ^' 

Residue 21 is not very variable, containing F or Y 

■ .r.f 33 cases and I and W in the remaining cases 
,n 31 Of 33 cases ^^^^^^^ 

The side group of Y21 fiUs f 

the .ain chain of -^^^ solvent. 
-iP Of the . Side ^-P^J^^^^^^^^ ^Ltituting . or P 
Clearly one can va:^ the surface J^^ hydrophobic or 

so that the surface possible that 

hydrophilic in that region. It is possible that
the other aromatic amino acid (viz. H) or the other
hydrophobics (L, M, or V) might be tolerated.

Residue 27 most often contains A, but S, K, L, and
T are also observed. On structural grounds, this
residue will probably tolerate any hydrophilic amino
acid and perhaps any amino acid.

Residue 28 is G in BPTI. This residue is in a 
turn, but is not in a conformation peculiar to glycine. 
Six other types of amino acids have been observed at 
this residue: K, N, Q, R, H, and N. Small side groups 
at this residue might not contact HHMb simultaneously 
with residues 17 and 34. Large side groups could 
interact with HHMb at the same time as residues 17 and 
34. Charged side groups at this residue could affect
binding of HHMb on the surface defined by the other 
residues of the principal set. Any amino acid, except 
perhaps P, should be tolerated. 

Residue 29 is highly variable, most often
containing L. This fully exposed position will 
probably tolerate almost any amino acid except, 
perhaps, P. 

Residues 31, 32, and 34 are highly variable, 
exposed, and in extended conformations; any amino acid 
should be tolerated. 

Residues 48 and 49 are also highly variable and
fully exposed; any amino acid should be tolerated.

Residue 52 is in an alpha helix. Any amino acid, 
except perhaps P, might be tolerated. 
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ronsider possible variation of tiie 
How we consiaer p 

^ ,^ ^ i-^ 1 2^ of residues that are i« 
secondary set (Sec. 13.1.2) o Neighboring 

w o-F -the principal set. wexyi 

neighborhood of the P ^^^^ ^^^^^^ .^^^^^^ 

26 (K), 35 (Y), 47 (S), 50(D), and 53 (R) . 

<, is highly variable, extended, and 
Residue 9 is nignxy 

..^o.ad. Ke^iaue . "^^f J„ ^nrl n fro. 

- — ^r- r::: a"e :n: :::iauL ana « 

residue 31 to 34 either the 

" contribute "-"J-^^^^^ ^e „,,i„ fro. 31 

to 34 can £xt, « effectively reduce the 

,5 Must have aerivative. 
radius of curvature of the Bfii 

11 is highly variable, extenaea, and 
Eesiaue 11 IS "i^" ' - sliahtly far 

e^osed. Kesidue 11, li.« '^^^'X;,;;, ^IJs ana 
20 fro. the surface defined by ^^^f ^^^^^^^.^^^ances . 
„ill contribute to binding in the sa.e cir 

Residue 15 is highly varied. The "^^^^ f 
Eesiaue aefined by set 12. 

residue 15 points -y ^^^^^ ,,,,,, ,i.aing on 

25 Changes of charge at residue 

the surface defined by residue set #2. 

ifi is varied but points away from the 
Residue 16 is variea changes in 

defined by set #2. 

• T -!r, RPTI This residue is in an 

Residue IS IS I - ,,,e other a.lno 

extended confor.ation and is expo ^_ 
35 acids have been observed at this re 
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and T. only T is hydrophilic. The side <3Xon-p J9oir±s^ 



directly avay £ro. the surface defined 
,2 substitution of charged amino acxds at this 
residue eould affect binding at surface defined by 



15 



5 residue set #2. 

Residue 20 is R in BPTI. This residue is in an
extended conformation and is exposed. Four other amino
acids have been observed at this residue: A, S, L, and
Q. The side group points directly away from the
surface defined by residue set #2. Changes in
charge at this residue could affect binding at surface
defined by residue set #2.

Residue 22 is only slightly varied, being Y, F, or
H in 30 of 33 cases. Nevertheless, A, N, and S have
been observed at this residue. Amino acids such as L,
M, I, or Q could be tried here. Alterations at residue
22 may affect the mobility of residue 21; changes in
charge at residue 22 could affect binding at the
surface defined by residue set #2.

Residue 24 shows some variation, but probably can
not interact with one molecule of the target
simultaneously with all the residues in the principal
set. Variation in charge at this residue might have an
effect on binding at the surface defined by the
principal set.

Residue 26 is highly varied and exposed. Changes
in charge may affect binding at the surface defined by
residue set #2; substitutions may affect the mobility
of residue 27 that is in the principal set.

Residue 35 is most often Y; W has been observed.
The side group of 35 is buried, but substitution of F
or W could affect the mobility of residue 34.

Residue 47 is always T or S in the sequence sample 
5 used. The 0,^, probably accepts a ^^^^ " 

the m Of reLdue 50 in the alpha hel.x. 
there is no overwhelming steric reason to preclude 
o^er amino acid types at this residue. In particular 
olr amino acids the side groups of which can acc^Pt 
K^r.^e VIZ N Q, and E, may be acceptable 
10 hydrogen bonds, viz^ r*, u, 

here. 

Residue 50 is often an acidic amino acid, but
other amino acids are possible.

Residue 53 is often R, but other amino acids have
been observed at this residue. Changes of charge may
affect binding to the amino acids in interaction set
#2.



20 



From published models (H0BE77, WIOD84) one can see
that R39 is on the opposite side of BPTI from the
surface defined by the residues in set #2. Therefore,
variation at residue 39 at the same time as variation
of some residues in set #2 is much less likely to
improve binding that occurs along surface #2 than is
variation of the other residues in set #2.

In addition to the twelve principal residues and
13 secondary residues, there are two other residues,
30 (C) and 33 (F), involved in surface #2 that we will
probably not vary, at least not until late in the
procedure. These residues have their side groups
buried inside BPTI and are conserved. Changing these
residues does not change the surface nearly so much as




does changing residues in the principal set. These
buried, conserved residues do, however, contribute to
the surface area of surface #2. The surface of residue
set #2 is comparable to the area of the trypsin-binding
surface. Principal residues 17, 19, 21, 27, 28, 29,
31, 32, 34, 48, 49, and 52 have a combined
solvent-accessible area of 946.9 Å^2. Secondary residues 9, 11,
15, 16, 18, 20, 22, 24, 26, 35, 47, 50, and 53 have
combined surface of 1041.7 Å^2. Residues 30 and 33 have
exposed surface totaling 38.2 Å^2. Thus the three
groups' combined surface is 2026.8 Å^2.
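
The combined surface quoted above is the sum of the three group totals. The few lines below reproduce that sum as a check. The dictionary keys are illustrative labels, not terms from the tables.

```python
# Solvent-accessible areas (square Angstroms) quoted in the text for the
# three groups of residues that make up surface #2.
areas = {
    "principal": 946.9,       # residues 17, 19, 21, 27, 28, 29, 31, 32, 34, 48, 49, 52
    "secondary": 1041.7,      # residues 9, 11, 15, 16, 18, 20, 22, 24, 26, 35, 47, 50, 53
    "buried_30_and_33": 38.2  # conserved residues 30 and 33
}

total_area = round(sum(areas.values()), 1)
shares = {name: area / total_area for name, area in areas.items()}

if __name__ == "__main__":
    print("combined surface:", total_area)
    for name, share in shares.items():
        print(f"{name}: {100 * share:.1f}% of combined surface")
```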

Residue 30 is C in BPTI and is conserved in all
homologous sequences. It should be noted, however,
that C14/C38 is conserved in all natural sequences, yet
Marks et al. (MARK87) showed that changing both C14 and
C38 to A,A or T,T yields a functional trypsin
inhibitor. Thus it is possible that BPTI-like
molecules will fold if C30 is replaced.

Residue 33 is F in BPTI and in all homologous 
sequences. Visual inspection of the BPTI structure 
suggests that substitution of Y, M, H, or L might be 
tolerated. 

Given our hypothetical affinity separation
sensitivity, Csensi, we decide to vary six residues,
leaving some margin for errors in the actual base
composition of variegated bases. To obtain maximal
recognition, we choose residues from the principal set
that are as far apart as possible. Table 36 shows the
distances between the beta carbons of residues in the
principal and peripheral set. R17 and V34 are at one
end of the principal surface. Residues A27, G28, L29,
A48, E49, and M52 are at the other end, about twenty
Angstroms away; of these, we will vary residues 17, 27,
29, 34, and 48. Residues 28, 49, and 52 will be varied
at later rounds.

Of the remaining principal residues, 21 is left to 
later variations. Among residues 19, 31, and 32, we 
arbitrarily pick 19 to vary. 

Unlimited variation of six residues produces 6.4 x
10^7 amino acid sequences. By hypothesis, Csensi = 1
in 4 x 10^8. Table 37 shows the programmed variegation
at the chosen residues. The parental sequence is
present as 1 part in 5.5 x 10^7, but the least favored
sequences are present at only 1 part in 4.2 x 10^9.

Among single-amino-acid substitutions from the PPBD
the least favored is P17-I19-A27-L29-V34-A48 and has a
calculated abundance of 1 part in 1.6 x 10^8. Using the
optimal qfk codon, we can recover the parental sequence
and all one-amino-acid substitutions to the PPBD if
actual nt compositions come within 5% of programmed
compositions. The number of transformants is Mntv =
X.o X 10^ (also by hypothesis), thus we will produce 
most of the programmed sequences. 
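
The count of amino acid sequences and its relation to the separation sensitivity can be checked with a few lines. The sketch assumes twenty amino acids at each of six varied residues and the hypothetical sensitivity of 1 part in 4 x 10^8; variable names are illustrative.

```python
N_AMINO_ACIDS = 20
N_VARIED_RESIDUES = 6

# Number of distinct amino acid sequences from unlimited variation.
n_sequences = N_AMINO_ACIDS ** N_VARIED_RESIDUES

# Hypothetical sensitivity: one desired phage recoverable among this many.
c_sensi = 4e8

# If every sequence were equally abundant, each would be present at
# 1 part in n_sequences, which must not be rarer than 1 part in c_sensi.
uniform_library_detectable = n_sequences <= c_sensi
margin = c_sensi / n_sequences

if __name__ == "__main__":
    print(f"sequences: {n_sequences:.2e}")
    print(f"margin for unequal abundances: {margin:.2f}-fold")
```

The margin of roughly six-fold is what is left to absorb unequal codon abundances, which is why the least favored sequences fall below the detection limit while the parental sequence does not.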

The residue numbers above refer to mature BPTI.
Since Table 25 refers to the pre-M13CP-BPTI protein,
all mature BPTI sequence numbers have been increased by
the length of the signal sequence, 23. Thus, we wish
to vary residues 40, 42, 50, 52, 57, and 71. A DNA
subsequence containing all these codons is found
between the ApaI/DraII/PssI sites at base 191 and the
SphI site at base 309 of the osp-pbd gene. Among ApaI,
DraII, and PssI, ApaI is preferred because it recognizes
six bases without any ambiguity and will cut fewer
sequences in the vgDNA. Gratuitous restriction sites
can be avoided in some cases by use of codon ambiguity:
changing the codon for G51 from GGC to GGT makes it
impossible to generate an ApaI site at codons 50, 51,
and 52.

Each piece of dsDNA to be synthesized needs six to
eight bases added at either end to allow cutting with
restriction enzymes and is shown in Table 37. The
first synthetic base (before cutting with ApaI and
SphI) is 184 and the last is 322. There are 142 bases
to be synthesized. The center of the piece to be
synthesized lies between Q54 and V57. The overlap can
not include varied bases, so we choose bases 245 to 256
as the overlap that is 12 bases long. Note that the
codon for F56 has been changed to TTC to increase the
GC content of the overlap. The amino acids that are
being varied are marked as Z with a plus over them.
Codons 57 and 71 are synthesized on the sense (bottom)
strand. The design calls for "qfk" in the antisense
strand, so that the sense strand contains (from 5' to
3') a) equal parts C and A (i.e. the complement of k),
b) (0.40 T, 0.22 A, 0.22 C, and 0.16 G) (i.e. the
complement of f), and c) (0.26 T, 0.26 A, 0.30 C, and
0.18 G).
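
The conversion from the antisense "qfk" design to the sense-strand mixtures can be sketched as a reverse complement applied to nucleotide mixtures rather than to single bases. The q and f mixtures below are inferred from the sense-strand compositions quoted above, k is taken as equimolar T and G, and all function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Antisense-strand mixtures, 5' to 3' (fractions of each nucleotide).
# Q and F are back-calculated from the sense-strand mixtures in the text.
Q = {"T": 0.26, "C": 0.18, "A": 0.26, "G": 0.30}
F = {"T": 0.22, "C": 0.16, "A": 0.40, "G": 0.22}
K = {"T": 0.50, "G": 0.50}


def complement_mix(mix):
    """Complement every nucleotide in a mixture, keeping its fraction."""
    return {COMPLEMENT[base]: frac for base, frac in mix.items()}


def reverse_complement_codon(codon):
    """Reverse the order of positions and complement each mixture."""
    return [complement_mix(mix) for mix in reversed(codon)]


sense = reverse_complement_codon([Q, F, K])

if __name__ == "__main__":
    for label, mix in zip("abc", sense):
        parts = ", ".join(f"{frac:.2f} {base}" for base, frac in sorted(mix.items()))
        print(f"{label}) {parts}")
```

The three printed mixtures correspond to items a), b), and c) of the sense-strand description.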

Each residue that is encoded by "qfk" has 21
possible outcomes, each of the amino acids plus stop.
Table 12 gives the distribution of amino acids encoded
by "qfk", assuming 5% errors. The abundance of the
parental sequence is the product of the abundances of R
x I x A x L x V x A. The abundance of the least-
favored sequence is 1 in 4.2 x 10^9.
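
The abundance products described above can be reproduced from the Table 12 entries. The sketch below uses the values as read from Table 12; the F entry (2.49%) is inferred from the stated ratio Abun(F)/Abun(R) = 0.3248 and should be treated as an assumption.

```python
from math import prod

# Abundances under "qfk" with 5% errors, as read from Table 12.
# F is inferred from ratio = Abun(F)/Abun(R) = 0.3248 (assumption).
TABLE12 = {
    "R": 0.0768, "I": 0.0271, "A": 0.0459,
    "L": 0.0671, "V": 0.0600, "F": 0.0249,
}

PARENTAL = "RIALVA"  # residues 17, 19, 27, 29, 34, 48 of BPTI

# Abundance of the parental sequence: product over the six varied codons.
p_parental = prod(TABLE12[aa] for aa in PARENTAL)

# Least-favored sequence: the least-favored amino acid at all six codons.
p_least = TABLE12["F"] ** len(PARENTAL)

if __name__ == "__main__":
    print(f"parental:      1 in {1 / p_parental:.2e}")
    print(f"least favored: 1 in {1 / p_least:.2e}")
```

Under these inputs the parental sequence comes out near 1 part in 5.7 x 10^7 and the least-favored sequence near 1 part in 4.2 x 10^9, in line with the figures quoted in the text.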

Olig#27 and olig#28 are annealed and extended with
Klenow fragment and all four (nt)TPs. Both the ds
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...... « ana K^^P- Z^^^^ 

and Spli I. The cut DNA pur^fx^^^ ^^^^ ^^^^^^^^ 
pieces ligated (See Sec 1 • ^^^^^ ^^^^^^^^ ^ 

competent PE383. (Sec, ^^'^^ ^^^^^ „ith 5.0 1 

sufficient number of transf orxaants , 

of cells, 

1 i in 5 0 1 of LB broth at 37°C 

1) culture coil in 5 0 1 ^ ^ 
until cell density reaches 5 x 
cells/ml/ 

• ^n-r 65 minutes, centrifuge the 

2) chill on ice for 65 min 

a-t- 4000g for 5 minutes 
cell suspension at 4uvjuy 

resuspend the cells in 

3) discard supernatant res P 

1667 ml of an ice-cold, stern 
BiM CaCl2f 

on ice for 15 minutes, and then 

4) chill on ice q 
centrifuge at «00g for 5 minutes 

. i„ 2 X 400 ml of loe-oold, 

5) resuspend cells in 2 x 40c for 24 
sterile 60 .« cacl2. -tore cells at 

hours , 

... « <ioo in - - - Utiga-: - 
TE buffer; nix, inculafe on ice 

distribute into 200 Ml ali^ots and l^eat 
shocic cells at 42-0 for 20 seconds, 



30 



35 



s, .aa 200 ^ .B brotn and incubate at 3.^^ for 

1 hour,- 

9) add the culture to 2.0 l of LB broth
containing ampicillin at 35-100 ug/ml and
culture overnight at 37°C,

10) after 6 hours, remove 200 ml and plate 0.5
ml portions with log phase JM107 on LB agar,
using the soft-agar overlay technique. Phage
are prepared from the soft agar,

11) centrifuge the overnight culture to remove
cells, and pellet phage (MESS83),

12) harvest virions by method of Salivar, et
al. (SALI64).



It is important to: a) use all or nearly all the
vgDNA synthesized in ligation, b) use all or nearly all
the ligation mixture to transform cells, and c) culture
all or nearly all the transformants. These measures
are directed at maintaining diversity.

It is important to collect virions in a way that
samples all or nearly all the transformants. Because
F- cells are used in the transformation, multiple 
25 infections do not pose a problem in the overnight phage 
production. F« cells are used for phage production m 
agar. 

HHMb has a pI of 7.0 and we carry out
chromatography at pH 8.0 so that HHMb is slightly
negative while BPTI and most of its mutants are
positive. HHMb is fixed (Sec. 15.1) to a 2.0 ml column
on Affi-Gel 10 (TM) or Affi-Gel 15 (TM) at 4.0 mg/ml
support matrix, the same density that is optimal for a
column supporting trp.

To remove variants of BPTI with strong,
indiscriminate binding for any protein or for the
support matrix (Sec. 15.2), we pass the variegated
population of virions over a column that supports
bovine serum albumin (BSA) before loading the
population onto the {HHMb} column. Affi-Gel 10 (TM) or
Affi-Gel 15 (TM) is used to immobilize BSA at the
highest level the matrix will support. A 10.0 ml
column is loaded with 5.0 ml of Affi-Gel-linked-BSA;
this column, called {BSA}, has Vv = 5.0 ml. The
variegated population of virions containing 10^12 pfu in
1 ml (0.2 x Vv) of 10 mM KCl, 1 mM phosphate, pH 8.0
buffer is applied to {BSA}. We wash {BSA} with 4.5 ml
(0.9 x Vv) of 50 mM KCl, 1 mM phosphate, pH 8.0 buffer.
The wash with 50 mM salt will elute virions that adhere
slightly to BSA but not virions with strong binding.
The pooled effluent of the {BSA} column is 5.5 ml of
approximately 13 mM KCl.

The column {HHMb} is first blocked by treatment
with 10^11 virions of M13(am429) in 100 ul of 10 mM KCl
buffered to pH 8.0 with phosphate; the column is washed
with the same buffer until OD260 returns to base line
or 2 x Vv have passed through the column, whichever
comes first. The pooled effluent from {BSA} is added
to {HHMb} in 5.5 ml of 13 mM KCl, 1 mM phosphate, pH
8.0 buffer. The column is eluted (Sec. 15.3) in the
following way:

1) 10 mM KCl buffered to pH 8.0 with phosphate, 
until optical density at 280nm falls to base line 
or 2 X Vv, whichever is first, (effluent 
discarded),

Table 2: Preferred Outer-Surface Proteins

Genetic      Preferred
Package      Outer-Surface    Reason for preference
             Protein

M13          coat protein     a) exposed amino terminus,
             (gpVIII)         b) predictable post-
                                 translational
                                 processing,
                              c) numerous copies in
                                 virion.
             gpIII            a) fusion data available.

PhiX174      G protein        a) known to be on virion
                                 exterior,
                              b) small enough that
                                 the G-ipbd gene can

E. coli      LamB             a) fusion data available,
                              b) non-essential.

B. subtilis  CotC             a) no post-translational
spores                           processing,
                              b) distinctive sequence
                                 that causes protein to
                                 localize in spore coat,
                              c) non-essential.

             CotD             Same as for CotC.
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Table 7: Atomic radii 
Angstroms 



Calpha 1-^° 

Ocarbonyl -^'^^ 

Namide ^'^^ 

Other atoms 1.80 



Table 8

Fraction of DNA molecules having
n non-parental bases when
reagents that have fraction
M of parental nt are used.

f0    .9000     .5000    .1000   .0100    .0010    .000001
f1    .09499    .35061   .2393   .04977   .00777   .0000175
f2    .00485    .1188    .2768   .1197    .0292    .000149
f3    .00016    .0259    .2061   .1854    .0705    .000812
f4    .000004   .00409   .1110   .2077    .1232    .003207



f8 0- 

fl6 0. 

f23 0. 
Tnost 0 



2x10-7 .00096 .0336 .1182 .080165 
Q 5x10-7 .00006 .027281 



0. 

0 



0. 

2 



0. 



0. 

7 



.0000089 

1?- 



"most" is the value 
probability. 



of n having the highest 



Table 9: best vgCodon

Program "Find Optimum vgCodon."
INITIALIZE-MEMORY-OF-ABUNDANCES
DO ( t1 = 0.21 to 0.31 in steps of 0.01 )
. DO ( c1 = 0.13 to 0.23 in steps of 0.01 )
. . DO ( a1 = 0.23 to 0.33 in steps of 0.01 )
Comment calculate g1 from other concentrations
. . . g1 = 1.0 - t1 - c1 - a1
. . . IF( g1 .ge. 0.15 )
. . . . DO ( a2 = 0.37 to 0.50 in steps of 0.01 )
. . . . . DO ( c2 = 0.12 to 0.20 in steps of 0.01 )
Comment Force D+E = R + K
. . . . . . g2 = (g1*a2 - .5*a1*a2)/(c1+0.5*a1)
Comment Calc t2 from other concentrations.
. . . . . . t2 = 1. - a2 - c2 - g2
. . . . . . IF( g2.gt.0.1 .and. t2.gt.0.1 )
. . . . . . . CALCULATE-ABUNDANCES
. . . . . . . COMPARE-ABUNDANCES-TO-PREVIOUS-ONES
. . . . . . end_IF_block
. . . . . end_DO_loop ! c2
. . . . end_DO_loop ! a2
. . . end_IF_block ! if g1 big enough
. . end_DO_loop ! a1
. end_DO_loop ! c1
end_DO_loop ! t1
WRITE the best distribution and the abundances.
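
The search in Table 9 can be sketched in runnable form as follows. Two points are assumptions rather than statements of the table: CALCULATE-ABUNDANCES is taken to mean summing codon probabilities over the standard genetic code with an equimolar T/G third base, and COMPARE-ABUNDANCES is taken to keep the distribution with the largest ratio of least-favored to most-favored amino acid. Function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code; '*' marks a stop codon.
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def abundances(first, second, third):
    """Amino-acid (and stop) abundances for per-position nt fractions."""
    out = {}
    for a, b, c in product(first, second, third):
        aa = CODE[a + b + c]
        out[aa] = out.get(aa, 0.0) + first[a] * second[b] * third[c]
    return out


def score(ab):
    """Assumed figure of merit: least-favored / most-favored amino acid."""
    values = [ab.get(aa, 0.0) for aa in "ACDEFGHIKLMNPQRSTVWY"]
    return min(values) / max(values)


def find_optimum(t1_range=range(21, 32), c1_range=range(13, 24),
                 a1_range=range(23, 34), a2_range=range(37, 51),
                 c2_range=range(12, 21)):
    """Grid search over nt fractions given in hundredths, as in Table 9."""
    third = {"T": 0.5, "G": 0.5}
    best = (-1.0, None, None)
    for t1h, c1h, a1h in product(t1_range, c1_range, a1_range):
        g1h = 100 - t1h - c1h - a1h
        if g1h < 15:            # IF( g1 .ge. 0.15 )
            continue
        t1, c1, a1, g1 = t1h / 100, c1h / 100, a1h / 100, g1h / 100
        first = {"T": t1, "C": c1, "A": a1, "G": g1}
        for a2h, c2h in product(a2_range, c2_range):
            a2, c2 = a2h / 100, c2h / 100
            # Force D+E = R+K, as in the pseudocode.
            g2 = (g1 * a2 - 0.5 * a1 * a2) / (c1 + 0.5 * a1)
            t2 = 1.0 - a2 - c2 - g2
            if g2 > 0.1 and t2 > 0.1:
                second = {"T": t2, "C": c2, "A": a2, "G": g2}
                s = score(abundances(first, second, third))
                if s > best[0]:
                    best = (s, first, second)
    return best


# The "qfk" mixtures quoted elsewhere in the text, for comparison.
Q = {"T": 0.26, "C": 0.18, "A": 0.26, "G": 0.30}
F = {"T": 0.22, "C": 0.16, "A": 0.40, "G": 0.22}
qfk = abundances(Q, F, {"T": 0.5, "G": 0.5})
```

Calling find_optimum() with its defaults walks the full grid of Table 9 and may take some seconds in pure Python; narrower ranges can be passed for a quick look. The qfk dictionary evaluates the published mixtures directly and can be compared with Table 10.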



Table 10: Abundances obtained from optimum vgCodon

Amino acid   Abundance        Amino acid   Abundance
A            4.80%            C            2.86%
D            6.00%            E            6.00%
F            2.86%            G            6.60%
H            3.60%            I            2.86%
K            5.20%            L            6.82%
M            2.86%            N            5.20%
P            2.88%            Q            3.60%
R            6.82%            S            7.02% mfaa
T            4.16%            V            6.60%
W            2.86% lfaa       Y            5.20%
stop         5.20%

lfaa = least-favored amino acid
mfaa = most-favored amino acid

ratio = Abun(W)/Abun(S) = 0.4074

a    (1/ratio)^a    (ratio)^a       stop-free
1    2.454          .4074           .9480
2    6.025          .1660           .8987
3    14.788         .0676           .8520
4    36.298         .0275           .8077
5    89.095         .0112           .7657
6    218.7          4.57 x 10^-3    .7258
7    536.8          1.86 x 10^-3    .6881
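
The three lower columns of Table 10 are powers of two numbers: the ratio of least-favored to most-favored abundance, and the stop-free fraction per codon (one minus the stop abundance). The sketch below regenerates them for a = 1 to 7. Reading a as the number of variegated codons is an interpretation, not a statement from the table.

```python
RATIO = 0.4074           # Abun(W)/Abun(S) from Table 10
STOP_FREE = 1 - 0.0520   # fraction of codons that are not stop

# One row per number of variegated codons a:
# (a, (1/ratio)^a, ratio^a, stop_free^a)
rows = [(a, (1 / RATIO) ** a, RATIO ** a, STOP_FREE ** a)
        for a in range(1, 8)]

if __name__ == "__main__":
    print("a   (1/ratio)^a   (ratio)^a    stop-free")
    for a, inverse, direct, stop_free in rows:
        print(f"{a}   {inverse:11.3f}   {direct:.3e}   {stop_free:.4f}")
```

For six variegated codons this gives a spread of about 219-fold between the most and least favored sequences, with roughly 73% of clones free of stop codons.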
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Table 11: Calculate worst codon. 



Program "Find worst vgCodon within Serr of given distribution."
INITIALIZE-MEMORY-OF-ABUNDANCES
Comment Serr is % error level.
READ Serr
Comment T1i,C1i,A1i,G1i, T2i,C2i,A2i,G2i, T3i,G3i
Comment are the intended nt-distribution.
READ T1i, C1i, A1i, G1i
READ T2i, C2i, A2i, G2i
READ T3i, G3i
Fdwn = 1.-Serr
Fup = 1.+Serr
DO ( t1 = T1i*Fdwn to T1i*Fup in 7 steps)
. DO ( c1 = C1i*Fdwn to C1i*Fup in 7 steps)
. . DO ( a1 = A1i*Fdwn to A1i*Fup in 7 steps)
. . . g1 = 1. - t1 - c1 - a1
. . . IF( (g1-G1i)/G1i .lt. -Serr)
Comment g1 too far below G1i, push it back
. . . . g1 = G1i*Fdwn
. . . . factor = (1.-g1)/(t1 + c1 + a1)
. . . . t1 = t1*factor
. . . . c1 = c1*factor
. . . . a1 = a1*factor
. . . ..end_IF_block
. . . IF( (g1-G1i)/G1i .gt. Serr)
Comment g1 too far above G1i, push it back
. . . . g1 = G1i*Fup
. . . . factor = (1.-g1)/(t1 + c1 + a1)
. . . . t1 = t1*factor
. . . . c1 = c1*factor
. . . . a1 = a1*factor
. . . ..end_IF_block
. . . DO ( a2 = A2i*Fdwn to A2i*Fup in 7 steps)
. . . . DO ( c2 = C2i*Fdwn to C2i*Fup in 7 steps)
. . . . . DO ( g2 = G2i*Fdwn to G2i*Fup in 7 steps)
Comment Calc t2 from other concentrations.
. . . . . . t2 = 1. - a2 - c2 - g2
. . . . . . IF( (t2-T2i)/T2i .lt. -Serr)
Comment t2 too far below T2i, push it back
. . . . . . . t2 = T2i*Fdwn
. . . . . . . factor = (1.-t2)/(a2 + c2 + g2)
. . . . . . . a2 = a2*factor
. . . . . . . c2 = c2*factor
. . . . . . . g2 = g2*factor
. . . . . . ..end_IF_block
. . . . . . IF( (t2-T2i)/T2i .gt. Serr)
Comment t2 too far above T2i, push it back
. . . . . . . t2 = T2i*Fup
. . . . . . . factor = (1.-t2)/(a2 + c2 + g2)
. . . . . . . a2 = a2*factor
. . . . . . . c2 = c2*factor
. . . . . . . g2 = g2*factor
. . . . . . ..end_IF_block
. . . . . . IF( g2.gt.0.0 .and. t2.gt.0.0 )
. . . . . . . t3 = 0.5*(1.-Serr)
. . . . . . . g3 = 1. - t3
. . . . . . . CALCULATE-ABUNDANCES
. . . . . . . COMPARE-ABUNDANCES-TO-PREVIOUS-ONES
. . . . . . . t3 = 0.5
. . . . . . . g3 = 1. - t3
. . . . . . . CALCULATE-ABUNDANCES
. . . . . . . COMPARE-ABUNDANCES-TO-PREVIOUS-ONES
. . . . . . . t3 = 0.5*(1.+Serr)
. . . . . . . g3 = 1. - t3
. . . . . . . CALCULATE-ABUNDANCES
. . . . . . . COMPARE-ABUNDANCES-TO-PREVIOUS-ONES
. . . . . . ..end_IF_block
. . . . . ..end_DO_loop ! g2
. . . . ..end_DO_loop ! c2
. . . ..end_DO_loop ! a2
. . ..end_DO_loop ! a1
. ..end_DO_loop ! c1
..end_DO_loop ! t1
WRITE the WORST distribution and the abundances.
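
The search of Table 11 can be sketched in Python as follows. This is a hedged illustration rather than the program itself: the listing does not spell out COMPARE-ABUNDANCES-TO-PREVIOUS-ONES, so the sketch assumes that "worst" means the lowest ratio of least-favored to most-favored amino acid. The helper names grid, close_out and worst_codon, and the intended fractions FIRST and SECOND (the rounded values quoted with Table 37), are assumptions introduced for the example.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def abundances(p1, p2, p3):
    """Expected fraction of each amino acid ('*' = stop) for one vgCodon."""
    out = {}
    for (a, pa), (b, pb), (c, pc) in product(p1.items(), p2.items(), p3.items()):
        aa = CODE[a + b + c]
        out[aa] = out.get(aa, 0.0) + pa * pb * pc
    return out


def grid(center, serr, steps):
    """`steps` (>= 2) evenly spaced values from center*Fdwn to center*Fup."""
    lo, hi = center * (1.0 - serr), center * (1.0 + serr)
    return [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]


def close_out(free, target, serr):
    """Compute the fourth fraction from the three free ones. If it falls more
    than serr from its target, pin it at that limit and rescale the others."""
    last = 1.0 - sum(free)
    lo, hi = target * (1.0 - serr), target * (1.0 + serr)
    if last < lo or last > hi:
        last = lo if last < lo else hi
        factor = (1.0 - last) / sum(free)
        free = [x * factor for x in free]
    return free, last


def worst_codon(first, second, serr, steps=7):
    """Return (ratio, lfaa, mfaa, abundances) for the lowest lfaa/mfaa ratio
    (assumed criterion) found within serr of the intended distribution."""
    worst = None
    for f1 in product(*(grid(first[b], serr, steps) for b in "TCA")):
        (t1, c1, a1), g1 = close_out(list(f1), first["G"], serr)
        p1 = {"T": t1, "C": c1, "A": a1, "G": g1}
        for f2 in product(*(grid(second[b], serr, steps) for b in "ACG")):
            (a2, c2, g2), t2 = close_out(list(f2), second["T"], serr)
            if not (g2 > 0.0 and t2 > 0.0):
                continue
            p2 = {"T": t2, "C": c2, "A": a2, "G": g2}
            for t3 in (0.5 * (1.0 - serr), 0.5, 0.5 * (1.0 + serr)):
                ab = abundances(p1, p2, {"T": t3, "G": 1.0 - t3})
                aas = {k: v for k, v in ab.items() if k != "*"}
                lfaa, mfaa = min(aas, key=aas.get), max(aas, key=aas.get)
                ratio = aas[lfaa] / aas[mfaa]
                if worst is None or ratio < worst[0]:
                    worst = (ratio, lfaa, mfaa, ab)
    return worst


# Assumed intended distribution; steps=3 keeps the demonstration fast.
FIRST = {"T": 0.26, "C": 0.18, "A": 0.26, "G": 0.30}
SECOND = {"T": 0.22, "C": 0.16, "A": 0.40, "G": 0.22}
ratio, lfaa, mfaa, ab = worst_codon(FIRST, SECOND, serr=0.05, steps=3)
print(lfaa, mfaa, round(ratio, 4))
```

Calling worst_codon with steps=7 reproduces the seven-step grid of the listing but takes noticeably longer in pure Python. Because the comparison rule is assumed, the ratio found this way need not coincide exactly with the value tabulated for 5% errors.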
Table 12: Abundances obtained using optimum vgCodon assuming 5% errors

Amino                     Amino
acid     Abundance        acid     Abundance
A          4.59%          C          2.76%
D          5.45%          E          6.02%
F          2.49%  lfaa    G          6.63%
H          3.59%          I          2.71%
K          5.73%          L          6.71%
M          3.00%          N          5.19%
P          3.02%          Q          3.97%
R          7.68%  mfaa    S          7.01%
T          4.37%          V          6.00%
W          3.05%          Y          4.77%
stop       5.27%

ratio = Abun(F)/Abun(R) = 0.3248

j     (1/ratio)^j     (ratio)^j        stop-free
1        3.079        .3248            [illegible]
2        9.481        .1055            .8973
3       29.193        .03425           [illegible]
4       89.888        [illegible]      [illegible]
5       [illegible]   [illegible]      [illegible]
6       [illegible]   1.17 x 10^-3     [illegible]
7     2624.1          3.81 x 10^-4     [illegible]
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Table 13: BPTI Homologues 



R # 
-3 
-2 
-1 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
31 
32 
33 
34 
35 
36 
37 
38 
39 
40 
41 
42 
43 



R 
P 
D 
F 
C 
L 
E 
P 
P 
Y 
T 
G 
P 
C 
K 
A 
R 
I 
I 
R 
Y 
F 
Y 



R 
P 
D 
F 
C 
L 
E 
P 
P 
Y 
T 
G 
P 
T 
K 
A 
R 
I 
I 
R 
Y 
F 
Y 



N N 
A A 



K 
A 
G 
L 
C 
Q 
T 
F 
V 
Y 
G 
G 
C 
R 
A 



R 
P 
D 
F 
C 
L 
E 
P 
P 
Y 
T 
G 
P 
A 



K 
A 
G 
L 
C 
Q 
T 
F 
V 
Y 
G 
G 
T 
R 
A 



R 
I 
I 
R 
Y 
F 
Y 



4 

F 

Q 
T 
P 
P 
D 
L 
C 

Q 
L 
P 

Q 
A 
R 
G 
P 
C 



5 6 7 8 9 10 11 12 13 14 15 



K K 
A A 



K K 
R R 
N N 



K 
A 
G 
L 
C 

Q 
T 
F 
V 
Y 
G 
G 
A 
R 
A 
K 
R 
N 



A 
L 
L 
R 
Y 
F 
Y 



N N 
A S 



T 

E 

R 

P 

D 

F 

C 

L 

E 

P 

P 

Y 

T 

G 

P 

C 

K 

A 

A 

M 

I 

R 

Y 

F 

Y 



T 
S 
N 
A 
C 
E 
P 
F 
T 
Y 
G 
G 
C 
Q 
G 
N 
N 
N 



G 
F 
C 
E 
T 
F 
V 
Y 
G 
G 
C 
R 
A 



R 
P 
D 
F 
C 
L 
E 
P 
P 
Y 
T 
G 
P 
C 
V 
A 
R 
I 
I 
R 
Y 
F 
Y 



N N 

A A 

K K 

A A 



R 

P 

D 

F 

C 

L 

E 

P 

P 

Y 

T 

G 

P 

C 

G 

A 

R 

l' 

I 

R 

Y 

F 

Y 



G 
L 
C 

Q 
T 
F 
V 
Y 
G 
G 
C 
R 
A 



R 
P 
D 
F 
C 
L 
E 
P 
P 
Y 
T 
G 
P 
C 
A 
A 
R 
I 
I 
R 
Y 
F 
Y 



N N 

A A 

K K 

A A 



K K 
S R 
N N 



G 
L 
C 

Q 
T 
F 
V 
Y 
G 
G 
C 
R 
A 
K 



G 
L 
C 

Q 
T 
F 
V 
Y 
G 
G 
C 
R 
A 
K 



R 
P 
D 
F 
C 
L 
E 
P 
P 
Y 
T 
G 
P 
C 
L 
A 
R 
I 
I 
R 
Y 
F 
Y 



R 
P 
D 
F 
C 
L 
E 
P 
P 
Y 
T 
G 
P 
C 
I 
A 
R 
I 
I 
R 
Y 
F 
Y 



N N 
A A 



K 
A 
G 
L 
C 
Q 
T 
F 
V 
Y 
G 
G 
C 
R 
A 



R 
P 
D 
F 
C 
L 
E 
P 
P 
Y 
T 
G 
P 
C 
K 
A 
R 
I 
I 
R 
Y 
F 
Y 



K 
A 
G 
L 
C 
Q 
T 
F 
V 
Y 
G 
G 
C 
R 
A 



Q 
P 
L 
R 



16 17 
- Z 
H G 



18 19 



K 
A 
G 
L 
C 
Q 
T 
F 
V 
Y 
G 
G 
C 
R 
A 



C 
I 
L 
H 
R 
N 
P 
G 
R 
C 
Y 
Q 



P 
A 
F 
Y 
Y 



N N 
A Q 



K 
K 
K 
Q 
C 
E 
G 
F 
T 
W 
S 
G 
C 
G 
G 



A 
A 



K K 
L Y 



C 
K 
L 
P 
L 
R 
I 
G 
P 
C 
K 
R 



K K 
I I 



R R 
N N 



K K 
R R 



K N 
R S 



N N N N 



P 
S 
F 
Y 
Y 
K 
W 
K 
A 
K 

Q 
C 
L 
P 
F 
D 
Y 
S 
G 
C 
G 
G 
N 
A 
N 



R 
P 
D 
F 
C 
E 
L 
P 
A 
E 
T 
G 
L 
C 
K 
A 
Y 
I 
R 
S 
F 



N 
L 
A 
A 
Q 
Q 
C 
L 

Q 
F 
I 
Y 
G 
G 
C 
6 
G 
N 
A 



R 

P 

R 

F 

C 

E 

L' 

P 

A 

E 

T 

G 

L 

C 

K 

A 

R 

I 

R 

S 

F 



H H 
Y Y 



N 
R 
A 
A 
Q 
Q 
C 
L 
E 
F 
I 
Y 
G 
G 
C 
G 
G 
N 
A 



D 

R 

P 

T 

F 

C 

N 

L 

P 

P 

E 

S 

G 

R 

C 

R 

G 

H 

I 

R 

R 

I 

Y 

Y 

N 

L 

E 

S 

N 



N N 



K 

V 
F 
F 
Y 
G 
G 
C 
G 
G 
N 
A 
N 



D 

K 

R 

D 

I 

C 

R 

L 

P 

P 

E 

Q 

G 

P 

C 

K 

G 

R 

L 

P 

R 

Y 

F 

Y 

N 

P 

A 

S 

R 



K M 
C C 



Z 
G 
R 
P 
S 
F 
C 
N 
L 
P 
A 
E 
T 
G 
P 
C 



E 
S 
F 
I 
Y 
G 
G 
C 
K 
G 
N 
K 
N 



S 
K 
S 
G 
G 
C 
Q 
Q 
F 
I 
Y 
G 
G 
C 
R 
G 



A 
A 
K 
Y 
C 
K 
L 
P 
V 
R 
Y 
G 
P 
C 



K K 
A K 



S 
I 
R 

Q 
Y 
Y 
Y 

N N 



K 
F 
P 
S 
F 
Y 
Y 



W 
K 
A 
K 
Q 
C 
L 
P 
F 
N 
Y 
S 
G 
C 
G 
G 



N N 
Q A 
N N 



- R 

- P 
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Table 13, continued. 

R # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 

44 NNNNNNNNNNNRRRRNNRR 

45 FFFFFFFFFFFFFFFFFFF 
tl KKKEKKKKKKKKKKKEKDK 

47 SSSTSSSSSSSTTTTTTTT 

48 AAATAAAAAAAIIIIRKTI 

49 EEEEEEEEEEE.EEDD DAQE 

50 DDDM DDDDDDDEEEEEEQE 

51 CCCCCCCCCCCCCCCCCCC 

52 MMMLMMMMMMERRRHRVQR 

53 RRRRRRRRRR^^^SSS??rS 

54 TTTITTTTTTTTTTTTAVT 

55 CCCCCCCCCCCCCCCCCCC 

56 GGGEGGGGGGGIVVVGRVV 

57 GGGP GGGGGGGRGGGGP-G 

58 AAAPAAAAAAAK---KP-- 

59 - -- Q- -- -- -- -- -- --E-- 

60 _--Q---------_ 

61 - -- T- -- -- -- - - 

62 «--D--------"" 

63 __-K---------~"~"~" 

64 - -- S- -- -- -- "~~""~~ 

R # = residue number

 1 BPTI
 2 Engineered BPTI from MARK87
 3 Engineered BPTI from MARK87
 4 Bovine Colostrum (DUFT85)
 5 Bovine Serum (DUFT85)
 6 Semisynthetic BPTI, TSCH87
 7 Semisynthetic BPTI, TSCH87
 8 Semisynthetic BPTI, TSCH87
 9 Semisynthetic BPTI, TSCH87
10 Semisynthetic BPTI, TSCH87
11 Engineered BPTI, AUER87
12 Dendroaspis polylepis polylepis (Black mamba) venom I (DUFT85)
13 Dendroaspis polylepis polylepis (Black mamba) venom K (DUFT85)
14 Hemachatus hemachates (Ringhals cobra) HHV II (DUFT85)
15 Naja nivea (Cape cobra) NNV II (DUFT85)
16 Vipera russelli (Russell's viper) RVV II (TAKA74)
17 Red sea turtle egg white (DUFT85)
18 Snail mucus (Helix pomatia) (WAGN78)
19 Dendroaspis angusticeps (Eastern green mamba) C13 S1 C3 toxin (DUFT85)
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Table 13, continued. 
R t 20 21 22 23 24 25 26 27 28 29 30 31 32 33 

P-QDDN---QK Ri 



-i I ; i 5 R R ^ K T 5 R H G D 
^ RPRPPPNEVHHPFL 

KYTKKT GD 
LAF FFFDS 

cccccccc 

I i ^ ^ i ^ L ' S ■£ 5 i s = « 



8 «^!!?pl?VPPPPFG 
? ^ I E D D E V I ? D D Y V D 
; PAPPPTVARKTTTA 
i rrGGGGGGGGKGGG 
\ R^PRRRPPPNIPPL 
■\ ? C C C C C C C C C C C C C 
5 YMKKLNRMR--KRF 
nT-AAAAAGAGQAAG 
K F t H Y L R M F. P T K G Y 



fe I I I X M ^ F T I- V V M F M 



J ^ P P P P S Q R R I K K 

- - A ; 

F 
Y 
Y 
N 

' S 
. H 
1 L 
[ H 

: K 
: c 

! Q 

33 F F F F F ? F F F F F F F F 



;iaPRARRLAARRL 
?? FFF??FYYWFFYYY 

21 F-F.Jyyyyfayyfns 
yyyyyyfy 

NDNNNNDD 
WSPSSGAT 
AAAHSTVR 
ASSLSSKI 
KNNHKMGB 
KKKKRAKl 

32 RPLKKKKTLAQTPb 



l\ N S N S N N N N D D K N N N 

25 Q?!!I5|tVrIkRE 

27 K A A S t L i S I L A A T T 

2I KNKNNHKMGKKGKK 

?! ^^t^TTTfKRAKTRFQN 



s ? ? ? ? ? ^ ? V ° ? V ? i 

^1 ^ 1 G G I G G G G R G G G G 



? S ? ? ? c ? ? 0 S ? ? ? c 

3I GRK§RGGMQDDKKQ 



GGGGGGGGGGAGG 
H N H H N N 
^3 ^ S M S S B S N » G G B N H 



'5 » H » ? ? i S ^ S S I ^G S 
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Table 13, continued. 
T? # 20 21 22 23 24 25 26 27 28 29 30 31 32 33 

1 I ? ? ? ? ? ? ^ ? ? ? ? 1 

TTTTTSTS S 



AT rp nt T T T T T T s> A " „ r 

;tiwwileeedael 
49 eeedddekkth^w^ 



50 
51 



I I S ^ E S i E = L L D D . 
I ^ ^ I 5 Q E L i I i S L E 

cccccccc 



54 T T A T T T 

55 C C C C C C 

56 I V V G V A 

57 G V G A A A 

58 - - - S S ~ 

59 - - 

60 - - 



GRGLEGSI 
V-VVLGGN 
KR-P YYAF- 



AGYS-GPR 
I G - - D 



20 Dendroaspis angusticeps (Eastern green mamba) C13 S2 C3 toxin (DUFT85)
21 Dendroaspis polylepis polylepis (Black mamba) B toxin (DUFT85)
22 Dendroaspis polylepis polylepis (Black mamba) E toxin (DUFT85)
23 Vipera ammodytes TI toxin (DUFT85)
24 [species name illegible] CTI toxin (DUFT85)
25 Bungarus fasciatus VIII B toxin (DUFT85)
26 Anemonia sulcata (sea anemone) 5 II (DUFT85)
27 Homo sapiens HI-14 "inactive" domain (DUFT85)
28 Homo sapiens HI-14 "active" domain (DUFT85)
29 beta bungarotoxin B1 (DUFT85)
30 beta bungarotoxin B2 (DUFT85)
31 Bovine spleen TI II (FIOR85)
32 Tachypleus tridentatus (Horseshoe crab) hemocyte inhibitor (NAKA87)
33 Bombyx mori (silkworm) SCI-III (SASA84)

Notes:
a) Both beta bungarotoxins have residue 15 deleted.
b) B. mori has an extra residue between C5 and C14; we have assigned
   F and G to residue 9.
c) All natural proteins have C at 5, 14, 30, 38, 51, & 55.
d) All homologues have F33 and G37.
e) Extra C's in bungarotoxins form interchain cystine bridges.
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Table 14: Tally of lonizable Groups. 
BPTI homologues. 



sequence nT?KRYHNHC02 + # 

Identifier ? ^ ^ J 4 0 1 1 ^ 

^ 2 2 4 6 4 0 1 1 6 16 

3 2 ' 4 6 4 0 1 1 4 

\ \ 4 \ 4 \ i 1 I 

2 2 3 6 4 0 1 1 I ^1 

' \\\ 6 1 2 i \ I II 

^ 2 2 3 6 4 0 1 1 5 15 

^ 2 2 3 6 4 0 1 1 5 15 

^° 2 2 3^^^^ 5 19 

14 1 4 2 7 2 2 1 .1 4 ^1 

2 5 3 7 3 2 1 1 3 19 

i? 2 4 6 7 3 0 1 1 7 21 

18 i i o 1 4 0 1 1 11 



27 
28 



17 



0^ 6 3 3 2 1 1 7 13 



^5 :*;;■:; i g 

23 ; i 4 6 5 1 1 1 5 17 



3246 5 J--^-^ cn 

2* 1 2 5 3 3 1 1 1 5 13 

25 i 5 4 4 4 1 1 1 2 16 

2^ i 4 2 2 4 0 1 1 -1 IJ 

2 3 4 3 3 0 1 1 2 14 



6 2 5 7 4 2 1 1 4 
6 2 6 7 4 
^? 2 3 5 4 4 



6 2 6 7 4 2 1 1 5^ 

2 3 5 4 4 0 1 1 4 
T3554011 

32 1 ? 3 1 4 0 1 1 -7 

Identifiers refer to the sequences given in Table 13.

+ is the sum K + R + NH - D - E - CO2, the approximate charge on the
  molecule at pH 7.0.

# is the sum K + R + NH + D + E + CO2, the number of ionized groups at
  pH 7.0.
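
The tally for one sequence can be reproduced with a few lines of Python. The sketch below is illustrative: the function name tally is an assumption, one free amino terminus and one free carboxy terminus are assumed, and the BPTI sequence is read off the residue column of Table 16.

```python
from collections import Counter

# BPTI (sequence 1), residues 1-58, taken from the residue column of Table 16.
BPTI = "RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA"


def tally(seq):
    """Count ionizable groups and form the '+' and '#' sums of Table 14."""
    counts = Counter(seq)
    row = {aa: counts[aa] for aa in "DEKRYH"}
    row["NH"] = 1    # assumed: one free (unblocked) amino terminus
    row["CO2"] = 1   # assumed: one free carboxy terminus
    basic = row["K"] + row["R"] + row["NH"]
    acidic = row["D"] + row["E"] + row["CO2"]
    row["+"] = basic - acidic   # approximate net charge
    row["#"] = basic + acidic   # number of ionized groups
    return row


print(tally(BPTI))
```

For BPTI this gives D 2, E 2, K 4, R 6, Y 4, H 0, a net charge of +6 and 16 ionized groups. A sequence with a blocked amino terminus would need NH set to 0.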
Table 15: Amino acids observed at each residue,
BPTI homologues

       Number
Res.   Different
 #     AAs     Contents                         BPTI
 -5     2    D -32                            -
 -4     2    E -32                            -
 -3     5    T P F Z -29                      -
 -2    10    Z3 R3 Q2 T2 H G L K E -18        -
 -1    10    D4 T2 P2 Q2 E G N K R -18        -
  1    10    R21 A2 K2 H2 P L I T G D         R
  2     9    P20 R4 A2 H2 N E V F L           P
  3    10    D15 K6 T3 R2 P2 S Y G A L        D
  4     7    F19 D4 L3 Y2 I2 A2 S             F
  5     1    C33                              C
  6    10    L11 E5 N4 K3 Q2 I2 Y2 D2 T R     L
  7     5    L18 E11 K2 S Q                   E
  8     7    P26 H2 A2 I L G F                P
  9     9    P17 A6 V3 R2 Q L K Y F           P
 10    10    Y11 E7 D4 A2 N2 R2 V2 S I D      Y
 11    10    T17 P5 A3 R2 I S Q Y V K         T
 12     2    G32 K                            G
 13     5    P22 R6 L3 N I                    P
 14     3    C31 T A                          C
 15    12    K15 R4 Y2 M2 L2 -2 V G A I N F   K
 16     7    A22 G5 Q2 R K D F                A
 17    12    R12 K5 A2 Y3 H2 S2 F2 L M T G P  R
 18     6    I21 M4 F3 L2 V2 T                I
 19     7    I11 P10 R6 S2 K2 L Q             I
 20     5    R19 A7 S4 L2 Q                   R
 21     4    Y18 F13 W I                      Y
 22     6    F14 Y14 H2 A N S                 F
 23     2    Y32 F                            Y
 24     4    N26 K3 D3 S                      N
 25    10    A12 S5 Q3 P3 W3 L2 T2 K G R      A
 26     9    K16 A6 T2 E2 S2 R2 G H V         K
 27     5    A18 S8 K3 L2 T2                  A
 28     7    G13 K10 N5 Q2 R H M              G
 29    10    L9 Q7 K7 A2 F2 R2 M G T N        L
 30     1    C33                              C
 31     7    Q12 E11 L4 K2 V2 Y N             Q
 32    11    T12 P5 K4 Q3 E2 L2 G V S R A     T
 33     1    F33                              F
 34    11    V11 I8 T3 D2 N2 Q2 F H P R K     V
 35     2    Y31 W2                           Y
 36     3    G27 S5 R                         G
 37     1    G33                              G
 38     3    C31 T A                          C
 39     7    R13 G9 K4 Q3 D2 P M              R
 40     2    G22 A11                          A
 41     3    N20 K11 D2                       K
 42     9    A11 R9 S4 G3 H2 D Q K N          R
 43     2    N31 G2                           N
 44     3    N21 R11 K                        N
 45     2    F32 Y                            F
 46     8    K24 E2 S2 D H V Y R              K
 47     2    T19 S14                          S
 48     9    A11 I9 E4 T2 W2 L2 R K D         A
 49     7    E19 D6 A2 Q2 K2 T H              E
 50     6    E16 D12 L2 M Q K                 D
 51     1    C33                              C
 52     7    R13 M10 L3 E3 Q2 H V             M
 53     8    R21 Q3 E2 H2 C2 G K D            R
 54     7    T23 A3 V2 E2 I Y K               T
 55     1    C33                              C
 56     8    G15 V8 I3 E2 R2 A L S            G
 57     8    G19 V4 A3 P2 -2 R L N            G
 58     8    A11 -10 P3 K3 S2 Y2 R F          A
 59     9    -24 G2 Q E A Y S P R             -
 60     6    -28 Q R I G D                    -
 61     3    -31 T P                          -
 62     2    -32 D                            -
 63     2    -32 K                            -
 64     2    -32 S                            -
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Table 16: Exposure in BPTI 



Coordinates taken from Brookhaven Protein Data Bank entry 6PTI.

HEADER    PROTEINASE INHIBITOR (TRYPSIN)    13-MAY-87    6PTI
COMPND    BOVINE PANCREATIC TRYPSIN INHIBITOR
COMPND   2 (/BPTI$, CRYSTAL FORM /III$)
AUTHOR    A.WLODAWER

Solvent radius = 1.40 A
Atomic radii given in Table 7
Areas in Angstroms-squared.

                       Not                    Not
            Total      covered                covered
Residue     area       by M/C     fraction    at all     fraction
ARG  1     342.45      205.09     0.5989      152.49     0.4453
PRO  2     239.12       92.65     0.3875       47.56     0.1989
ASP  3     272.39      158.77     0.5829      143.23     0.5258
PHE  4     311.33      137.82     0.4427       43.21     0.1388
CYS  5     241.06       48.36     0.2006        0.23     0.0010
LEU  6     280.98      151.45     0.5390      115.87     0.4124
GLU  7     291.39      128.91     0.4424       90.39     0.3102
PRO  8     236.12      128.71     0.5451       99.98     0.4234
PRO  9     236.09      109.82     0.4652       45.80     0.1940
TYR 10     330.97      153.63     0.4642       79.49     0.2402
THR 11     249.20       80.10     0.3214       64.99     0.2608
GLY 12     184.21       56.75     0.3081       23.05     0.1252
PRO 13     240.07      130.25     0.5426       75.27     0.3136
CYS 14     237.10       75.55     0.3186       53.52     0.2257
LYS 15     310.77      200.25     0.6444      192.00     0.6178
ALA 16     209.41       66.63     0.3182       45.59     0.2177
ARG 17     351.09      243.67     0.6940      201.48     0.5739
ILE 18     277.10      100.51     0.3627       58.95     0.2127
ILE 19     278.03      146.06     0.5254       96.05     0.3455
ARG 20     339.11      144.65     0.4266       43.81     0.1292
TYR 21     333.60      102.24     0.3065       69.67     0.2089
PHE 22     306.08       70.64     0.2308       23.01     0.0752
TYR 23     338.66       77.05     0.2275       17.34     0.0512
ASN 24     264.88       99.03     0.3739       38.69     0.1461
ALA 25     211.15       85.13     0.4032       48.20     0.2283
LYS 26     313.29      216.14     0.6899      202.84     0.6474
ALA 27     210.66       96.05     0.4560       54.78     0.2601
GLY 28     186.83       71.52     0.3828       32.09     0.1718
LEU 29     280.70      132.42     0.4718       93.61     0.3335
CYS 30     238.15       57.27     0.2405       19.33     0.0812
GLN 31     301.15      141.80     0.4709       82.64     0.2744
THR 32     251.26      138.17     0.5499       76.47     0.3043
Table 16, continued.

                       Not                    Not
            Total      covered                covered
Residue     area       by M/C     fraction    at all     fraction
PHE 33     304.27       59.79     0.1965       18.91     0.0622
VAL 34     251.56      109.78     0.4364       42.36     0.1684
TYR 35     332.64       80.52     0.2421       15.05     0.0452
GLY 36     187.06       11.90     0.0636        1.97     0.0105
GLY 37     185.28       84.26     0.4548       39.17     0.2114
CYS 38     234.56       73.64     0.3139       26.40     0.1125
ARG 39     417.13      304.62     0.7303      250.73     0.6011
ALA 40     209.53       94.01     0.4487       52.95     0.2527
LYS 41     314.60      166.23     0.5284      108.77     0.3457
ARG 42     349.06      232.83     0.6670      179.59     0.5145
ASN 43     266.47       38.53     0.1446        5.32     0.0200
ASN 44     269.65       91.08     0.3378       23.39     0.0867
PHE 45     313.22       69.73     0.2226       14.79     0.0472
LYS 46     309.83      217.18     0.7010      155.73     0.5026
SER 47     224.78       69.11     0.3075       24.80     0.1103
ALA 48     211.01       82.06     0.3889       31.07     0.1473
GLU 49     286.62      161.00     0.5617      100.01     0.3489
ASP 50     299.53      156.42     0.5222       95.96     0.3204
CYS 51     238.68       24.51     0.1027        0.00     0.0000
MET 52     293.05       89.48     0.3054       66.70     0.2276
ARG 53     356.20      224.61     0.6306      189.75     0.5327
THR 54     251.53      116.43     0.4629       51.64     0.2053
CYS 55     240.40       69.95     0.2910        0.00     0.0000
GLY 56     184.66       60.79     0.3292       32.78     0.1775
GLY 57     106.58       49.71     0.4664       38.28     0.3592
ALA 58

[illegible] given in Protein Data Bank

"Total area"     is the area measured by a rolling sphere of
                 radius 1.4 A, where only the atoms within the
                 residue are considered. This takes account of
                 conformation.

"Not covered     is the area measured by a rolling sphere of
by M/C"          radius 1.4 A where all main-chain atoms
                 [illegible] by side group atoms.

"Not covered     is the area measured by a rolling sphere of
at all"          radius 1.4 A where all atoms of the protein are
                 considered.
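
Each "fraction" column of Table 16 is the corresponding exposed area divided by the total area of the residue. A minimal Python check on four rows copied from the table follows; the names ROWS and fractions are illustrative only.

```python
# (residue, total area, not covered by M/C, not covered at all), in square
# Angstroms, copied from Table 16.
ROWS = [
    ("ARG 1", 342.45, 205.09, 152.49),
    ("CYS 5", 241.06, 48.36, 0.23),
    ("LYS 15", 310.77, 200.25, 192.00),
    ("ARG 39", 417.13, 304.62, 250.73),
]


def fractions(total, not_mc, not_all):
    """Return the two exposure fractions, rounded as in the table."""
    return round(not_mc / total, 4), round(not_all / total, 4)


for name, total, not_mc, not_all in ROWS:
    print(name, fractions(total, not_mc, not_all))
```

The printed pairs agree with the tabulated fractions for these residues, for example 0.5989 and 0.4453 for ARG 1.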
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Table 17: Plasmids used in Detailed Example 



Phage     Contents

LG1       M13mp18 with Ava II/Aat II/Acc I/Rsr II/Sau I adaptor

pLG2      LG1 with ampR and ColE1 of pBR322 cloned into
          Aat II/Acc I sites

pLG3      pLG2 with Acc I site removed

pLG4      pLG3 with first part of osp-pbd gene cloned into
          Rsr II/Sau I sites, Avr II/Asu II sites created

pLG5      pLG4 with second part of osp-pbd gene cloned into
          Avr II/Asu II sites, BssH I site created

pLG6      pLG5 with third part of osp-pbd gene cloned into
          Asu II/BssH I sites, Bbe I site created

pLG7      pLG6 with last part of osp-pbd gene cloned into
          Bbe I/Asu II sites

pLG8      pLG7 with disabled osp-pbd gene, same length DNA.

pLG9      pLG7 mutated to display BPTI(V15BPTI)

pLG10     pLG8 + tetR gene - ampR gene

pLG11     pLG9 + tetR gene - ampR gene
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Table 25: Annotated Sequence of iebd gene 
5'- C|GGA|CCG|TAT|CCA|GGC|TTT|ACA|CTT|TAT|          28
   | Rsr II  |           |  -35  |

    |GCT|TCC|GGC|TCG|TAT|AAT|GTG|TGG|               52
                    |  -10  |

    |AAT|TGT|GAG|CGG|ATA|ACA|ATT|                   73
    |       lac operator        |

    |CCT|AGG|AGG|CTC|ACT|                           88
    |Avr II |
        | S. D. |

    | m | k | k | s | l | v | l | k | a | s |
    |  1|  2|  3|  4|  5|  6|  7|  8|  9| 10|
    |ATG|AAG|AAA|TCT|CTG|GTT|CTT|AAG|GCT|AGC|
                            |Afl II | Nhe I |

    | v | a | v | a | t | l | v | p | m | l |
    | 11| 12| 13| 14| 15| 16| 17| 18| 19| 20|
    |GTT|GCT|GTC|GCG|ACC|CTG|GTA|CCG|ATG|CTG|
            | Nru I |       | Kpn I |

    | s | f | a | r | p | d | f | c | l | e |
    | 21| 22| 23| 24| 25| 26| 27| 28| 29| 30|
    |TCT|TTT|GCT|CGT|CCG|GAT|TTC|TGT|CTC|GAG|
                    |Acc III|       | Ava I |
                                    | Xho I |

    | p | p | y | t | g | p | c | k | a | r |
    | 31| 32| 33| 34| 35| 36| 37| 38| 39| 40|
    |CCG|CCA|TAT|ACT|GGG|CCC|TGC|AAA|GCG|CGC|
        |    PflM I     |           |BssH II|
                    | Apa I |
                    |Dra II |
                    | Pss I |

    | i | i | r | y | f | y | n | a | k |
    | 41| 42| 43| 44| 45| 46| 47| 48| 49|
    |ATC|ATC|CGT|TAT|TTC|TAC|AAC|GCT|AAA|

    | a | g | l | c | q | t | f | v | y | g | g |
    | 50| 51| 52| 53| 54| 55| 56| 57| 58| 59| 60|
    |GCA|GGC|CTG|TGC|CAG|ACC|TTT|GTA|TAC|GGT|GGT|
        | Stu I |               | Acc I |

    | c | r | a | k | r | n | n | f | k |
    | 61| 62| 63| 64| 65| 66| 67| 68| 69|
    |TGC|CGT|GCT|AAG|CGT|AAC|AAC|TTT|AAA|
            | Esp I |

    | s | a | e | d | c | m | r | t | c | g |
    | 70| 71| 72| 73| 74| 75| 76| 77| 78| 79|
    |TCG|GCC|GAA|GAT|TGC|ATG|CGT|ACC|TGC|GGT|
    |Xma III|       | Sph I |

    | g | a | a | e | g | d | d |
    | 80| 81| 82| 83| 84| 85| 86|
    |GGC|GCC|GCT|GAA|GGT|GAT|GAT|
    | Bbe I |
    | Nar I |

    | p | a | k | a | a |
    | 87| 88| 89| 90| 91|
    |CCG|GCC|AAA|GCG|GCC|                           361
    |       Sfi I       |

    | f | n | s | l | q | a | s | a | t |
    | 92| 93| 94| 95| 96| 97| 98| 99|100|
    |TTT|AAC|TCT|CTG|CAA|GCT|TCT|GCT|ACC|           388
                    | Hind III  |

    | e | y | i | g | y | a | w |
    |101|102|103|104|105|106|107|
    |GAA|TAT|ATC|GGT|TAC|GCG|TGG|                   409
                    | Mlu I |

    | a | m | v | v | v |
    |108|109|110|111|112|
    |GCC|ATG|GTG|GTG|GTT|                           424
    |    BstX I     |
    | Nco I |

    | i | v | g | a | t | i | g | i |
    |113|114|115|116|117|118|119|120|
    |ATC|GTT|GGT|GCT|ACC|ATC|GGT|ATC|

    | k | l | f | k | k | f | t | s | k | a |
    |121|122|123|124|125|126|127|128|129|130|
    |AAA|CTG|TTT|AAG|AAA|TTT|ACT|TCG|AAA|GCG|       478
                                |Asu II |

    |131|132|133|134|
    |TCT|TAA|TAG|TGA|GGT|TAC|CAG|TCT|
                    |BstE II|

    |AAG|CCC|GCC|TAA|TGA|GCG|GGC|TTT|TTT|TTT|       532
    |            Trp terminator             |

    |CCT|GAG|G -3'                                  539
    | Sau I   |

Note the following enzyme equivalences:

    Xma III = Eag I
    Acc III = BspM II
    Dra II  = EcoO109 I
    Asu II  = BstB I
    Sau I   = Bsu36 I
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Table 27: DNA_synthl 
5/ I nrr. | Tr.r. \ STC I C-r.^ \ rrr, | tAT I CC1\ I CffC | TTT | ACAj CTT | TATj 
I nrT I TCC I r:c;r- 1 tcg | TA '^ | ^ fi^^ | <^-tg | TGG I 

|aaT |TGT|GAG|C ^|^TA|ACA|ATTl 
olig#4 = 3'- gt taa 



[r.CT|AGG| 
gga tec 

/ 3' = olig#3 
I GCC I GCT |- CCT I TCG I A AA j GCG \ 
egg cga gga age ttt cgc 

I TCT 1 TAA I TAG | TGA | GGT \ TAG 1 GAG ] TOT | 
aga att ate act cca atg gtc aga 

1 AAG I CCC I GCC ! TAA | TGA | GCG | GGC \ TTT | TTT 1 TTT | 
ttc ggg egg att act cgc ceg aaa aaa aaa 

1 CCT I GAG| GCAl GGTl GAG| CG 
gga etc egt cea etc ge - 5' 
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Table 27, continued. 



••Top" strand 9^ 
iiBottom" strand 100 

23 (14 c/g and 9 a/t) 
overlap ^ 

Net length 158 
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Table 28: DNA_seq2 



5'- |gcalcca|acg| 
I spacer | 



PCr/US89/03731 



[ CCT I AGG I AGG | CTC \ ACT | 
I Avr III 



S. D. 



liiilk|k|s|llv|l|klalsl 
I 1 I 2 1 3 I 4 I 5 I 6 I 7 I 8 I 9 1 lot 
I ATG I AAG 1 AAA I TCT I CTG I GTT 1 CTT 1 AAG I GCT 1 AGC 1 

I afl TTl Nhe I I 



ivja|vlaltll|v|plm|l| 
1 llj 12| 13 1 141 151 161 17| 181 19 1 20| 
I GTT 1 GCT I GTC | GCG | ACC | CTG ] GTA 1 CCG | ATG | CTG 1 
I Nru 1 1 I KPn 1 1 



lslflalrlpld|flcllle 
I 211 221 231 241 25] 26| 27 1 28] 29 | 30 
1 TCT I TTT 1 GCT 1 CGT 1 CCG 1 GAT | TTC 1 TGT 1 CTC 1 GAG 
lAccIIll 1 Ava I 



Xho I 



lPlplyltlglplc|k|alr 
I 31| 321 331 34| 35l 36l 37 1 38l 39l 40 
I CCG I CCA I TAT | ACT 1 GGG 1 CCC 1 TGC 1 AAA 1 GCG | CGC 
I PflM I I |BssH II 
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Table 28, continued. 



I Apa I I. 
I Dra II I 
I Pss I I 



I i I i I r 1 
I 41| 42| 43| 
I ate I ate I cgt I 



I t I s [ k I 
1127 1 128 1 1291 

lACT|TCG|AAalgcglgctlgcg| - 3' 
|Afiii II I spacer 1 



wo 90/02809 



232 



PCr/US89/03731 



Table 30: DNA_seq3 

I a I r 1 
I 39| 40| 
5'- lcccltgclacalGCG|CGCl 
I snac ^-^ |BssH III 

|ili|r|ylflyln|a|kl 

I 41| 42| 43l 44l 45} 46 | 47 | 48| 49) 
I ATC I ATC I CGT I TAT 1 TTC i TAG I AAC 1 GCT 1 AAA I 

la|g|llclq|tlf|vly|g|gl 

1 50| 51| 521 53| 54| 55| 56 | 57 j 58] 59 j 60| 
1 GCA 1 GGC t CTG [ TGC ] CAG 1 ACC | TTT j GTA 1 TAQ | GGT | GGT ] 
\ Stu II I ACC I I 

I Xca I I 



lclr|ai3clr|n|nlf|]c| 

1 61| 62] 63| 64| 65| 66l 67 | 68 | 69] 
1 TGC I CGT I GCT I AAG I CGT 1 AAC 1 AAC 1 TTT 1 AAA 1 

! ESP I L 



|slaje|d|clin|r|t|c|g| 
I 70] 711 721 73| 741 75 1 76l 77 1 78 1 79 1 
1 TCG 1 GCC I GAA 1 GAT 1 TGC 1 ATG | CGT 1 ACC 1 TGC 1 GGT | 
1 Xmalll I I Sph Ij 

I g I a 1 

1 80| 8ll 
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Table 30, continued. 

|GGClGCClgct|gaa| 
I RViA T \ spacer 
I war I I 



I t I s I k 1 

1127 1 128 1 129] 
|ttt|acT|TCG|AAa|gcg|tcglccg| - 3' 

|ASU III 



PCr/US89/03731 

WO 90/02809 ri^*/^ 
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Table 32: DNA_seq4 



lg|a|a|elgldld| 

5/ I 80| 8II 821 831 841 85l 86l 

1 cct I cgc 1 cct 1 GGC I GCC I GOT 1 GAA 1 GGT 1 GAT I C5AT 1 

I spacer | Bbe I I 
I Nar I I 



1 p 1 a 1 k 1 a 1 a 1 
1 87 1 881 891 90 1 9ll 
1 CCG 1 GCC 1 AAA 1 GCG 1 GCC 1 
\ Sfi I i 



lflnls|llq|a!slalt| 

I 92| 93| 941 951 96| ^"^ I 99|l00| 
I TTT I AAC 1 TCT 1 CTG 1 CAA 1 GCT | TCT 1 GCT | ACC 1 

[Hind 3 I 



lelyli|g|Y|a|wl 
1 101 1 102 1 103 1 104 1 105 1 106 1 107 1 
1 GAA 1 TAT 1 ATC | GGT 1 TAG 1 GCG ] TGG 1 

I Mlu 1 1 



1 a 1 m 1 V 1 V 1 V I 
1 108 1 1091 110 1 111 1 112 1 
1 GCC 1 ATG 1 GTG 1 GTG 1 GTT 1 

I BstX I I 

\ NCO I] 
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Table 32, continued. 
|i|vlg|a|t|i|g|i| 
1 113 1 114 1 115 1 116 1 117 1 118 1 119 1 120 | 
I ATC I GTT I GGT | GCT | ACC | ATC | GGT | ATC | 



|k|l|f|klk|f|t!s|k| 

1 121 1 122 1 123 1 124 1 125 1 126 1 127 ] 128 1 129 | 

|AAAlCTG|TTTlAAGlAAA|TTT|ACT|TCG|AAa|gcg|tcg|ggc| - 3' 

|Asu II I spacer L 
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Table 34: Some interaction sets in BPTI 



Nixmber 
Res. Dif f . - - 

^ X, BPTI 1 2 3 4 5 

J A As Cor ^^nts , 



•"O 




D -32 














2 


E -32 


- 










—J 


5 


T P F Z -29 


- 










— ^ 


10 


Z3 R3 Q2 T2 H G L K E -18 


- 










-1 




D4 T2 P2 Q2 E G N K R -18 


- 










1 


10 


R21 A2 K2 H2 P L I T G D 


R 








•J 


2 


9 


P20 R4 A2 H2 N E V F L 


P 






5 


5 


3 


10 


D15 K6 T3 R2 P2 S Y G A L 


D 






4 


s 


4 


7 


F19 D4 L3 Y2 12 A2 S 


F 
C 






s 

X 


5 
X 


5 


1 


C33 












6 


10 


Lll E5 N4 K3 Q2 12 ¥2 D2 T R 


L 






4 




7 


5 


L18 Ell K2 S Q 


E 




s 


4 




8 


7 


P26 H2 A2 I L G F 


P 




3 


4 




9 


9 


P17 A6 V3 R2 Q L K Y F 


P 




s 3 


4 




10 


10 


Yll E7 D4 A2 N2 R2 V2 S I D 


Y 


s 


s 


4 




11 


10 


T17 P5 A3 R2 I S Q Y V K 


T 


1 


s 3 


4 








G 


X 


X 


X 




12 


2 


G32 K 
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Table 34, continued. 



13 


5 


P22 R6 L3 N I 


ir 


X 






e 


14 


3 


C31 T A 


r> 


1 

X 




s s 


5 


15 


12 


kl5 R4 Y2 M2 L2 -2 V G A I N F 


K 


J. 


s 


J ft 




16 


7 


A22 G5 Q2 R K D F 




1 


s 


S S 


5 


17 


12 


R12 K5 A2 Y3 H2 S2 F2 L M T G P 


K 


1 
X 








18 


6 


121 M4 F3 L2 V2 T 


1 


X 


S 






19 


7 


111 PIO R6 S2 K2 L Q 


I 


1 


2 


3 


S 


20 


5 


R19 A7 S4 L2 Q 


R 


s 


s 






21 


4 


yi8 F13 W I 


TP 

y 










22 


6 


F14 Y14 H2 A N S 


F 




S 


O A 




23 


2 


Y32 F 


Y 










24 


4 


N26 K3 D3 S 


N 




s 


J 




25 


10 


A12 S5 Q3 P3 W3 L2 T2 K G R 


A 




s 


S 




26 


9 


K16 A6 12 E2 S2 R2 G H V 


K 




s 






27 


5 


A18 S8 K3 L2 T2 


A 




2 


O A- 

3 4 




28 


7 


G13 KIO N5 Q2 R H M 


r* 






S S 




29 


10 


L9 Q7 K7 A2 F2 R2 M G T N 


L 




2 


J 




30 


1 


C33 


C 




X 


X X 




31 


7 


Q12 Ell L4 K2 V2 Y N 


n 
W 






3 4 




32 


11 


T12 P5 K4 Q3 E2 L2 G V S R A 


X 






•J o 




33 


1 


F33 


F 


X 


X 


X X 




34 


11 


Vll 18 T3 D2 N2 Q2 F H P R K 


V 


1 


2 


3 s 




35 


2 


Y31 W2 


Y 


s 


s 


5 


5 


36 


3 


G27 S5 R 


G 


1 








37 


1 


G33 


G 


X 






X 


38 


3 


C31 T A 


C 


1 




s 


5 


39 


7 


R13 G9 K4 Q3 D2 P M 


R 


1 




4 


s 
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Table 34, continued. 




4U 




G22 All 


A s 






N20 Kll D2 


K 




q 


All R9 S4 G3 H2 D Q K N 


R 






N31 G2 


N 


44 


3 




N 






F32 Y 


F 


46 


Q 
O 


K24 E2 S2 D H V Y R 


K 


47 




T19 S14 


S 




9 


All 19 E4 T2 W2 L2 R K D 


A 




7 


E19 D6 A2 Q2 K2 T H 


E 


50 




E16 D12 L2 M Q K 


D 






C 


51 


J. 




M 


52 


/ 


R13 MIO L3 E3 Q2 H V 


53 


Q 
o 


R21 Q3 E2 H2 C2 G K D 


R 


A 


7 


T23 A3 V2 E2 I Y K 


T 




1 


C33 


C 




8 


G15 V8 13 E2 R2 A L S ' 


G 


-J 1 


8 


G19 V4 A3 P2 -2 R L N 


G 






All -10 P3 K3 S2 Y2 R F 


A 


59 


9 


-24 G2QEAYSPR 




60 


6 


-28 Q R I G D 




61 


3 


-31 T P 




62 


2 


-32 D 




63 


2 


-32 K 




64 


2 


-32 S 





S 5 
4 s 
s 5 
s 
s 
s 
5 

s 5 
2 s s 
2 s 
s 5 
X X 
2 s 
S 5 
5 
X 



s indicates secondary set 

X indicates in or close to surface but buried and/or highly 
conserved. 
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Table 35: 
Distances from Cbeta to
Tip of Side Group
in Angstroms

Amino Acid type     Distance
A                     0.0
C (reduced)           1.8
D                     2.4
E                     3.5
F                     4.3
G
H                     4.0
I                     2.5
K                     5.1
L                     2.6
M                     3.8
N                     2.4
P                     2.4
Q                     3.5
R                     6.0
S                     1.5
T                     1.5
V                     1.5
W                     5.3
Y                     5.7

Notes: These distances were calculated for standard model parts
with all side groups fully extended.
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Table 36: Distances, BPTI residue set #2 
Distances in Angstroms between Cbeta^' 
Hypothetical Cfaeta was added to each Glycine. 



R17 



119 Y21 A27 G28 L29 Q31 T32 V34 A48 



L29 
Q31 
T32 
V34 
A48 
E49 



119 7.7 

Y21 15.1 8.4 

A27 22.6 17.1 12.2 

G28 26.6 20.4 13.8 5.3 

22.5 15.8 9.6 5.1 5.2 
16.1 10.4 6.8 6.8 10.6 6.8 
11.7 5.2 6.1 12.0 15.5 10.9 5.4 

5.6 6.5 11.6 17.6 21.7 18.0 11.4 8.2 
18.5 11.0 5.4 12.6 13.3 8.4 8.8 8.3 15.7 
22.0 14.7 8.9 16.9 16.1 12.2 13.9 13.3 19.8 5.5 
16. ^ o ^ 10 ^ 7.6 11.3 1-^.? 20.0 6.2 

P9 14.0 11.3 9.0 12.2 15.4 13.3 7.9 9.2 8.7 13.9 
Til 9.5 11.2 13.5 18.8 22.5 19.8 13.5 12.1 5.7 18.5 
7.9 14.6 20.1 27.4 31.3 27.9 21.4 18.1 10.3 24.6 
5.5 10.1 15.9 25.2 28.5 24.6 18.6 14.5 8.6 19.8 
6.1 6.0 11.2 21.3 24.4 20.2 14.7 10.4 7.0 15.0 
R20 10.6 5.9 5.4 16.0 18.5 14.6 9.8 6.9 7.8 10.2 
15.6 10.9 5.6 10.5 12.8 10.3 6.2 8.1 10.8 10.3 
N24 19.9 14.7 9.4 4.1 7.3 6.1 4.8 10.0 14.7 11.4 
24.4 20.1 15.2 5.4 7.7 9.8 10.1 15.3 19.0 17.0 
C30 18.9 12.1 4.6 8.8 9.5 5.3 5.9 8.2 14.9 4.9 
F33 10.8 7.4 7.7 12.6 16.4 13.0 6.6 5.6 5.5 12.2 
Y35 8.4 7.4 9.4 18.4 21.4 17.9 12.2 9.5 5.8 14.4 
S47 17.6 10.6 6.6 17.3 17.9 13.4 12.6 10.4 15.9 5.3 
D50 20.0 13.6 7.2 17.2 16.8 13.5 13.5 12.9 17.6 7.6 
C51 18.9 12.2 4.0 12.1 12.2 8.8 8.8 9.7 15.3 5.4 
R53 25.4 18.6 11.0 17.2 15.0 13.0 15.7 16.7 22.3 9.7 
R39 15.4 16.9 17.1 24.9 27.2 24.9 20.1 18.7 13.8 22.3 



K15 
A16 
118 
R20 
F22 
N24 
K26 
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Table 36, continued. 
Distances in Angstroms between CbetaS- ^ 

M«%7. 6 . 1 

P9 17.7 15.5 

Til 22.1 21.5 7.2 

27.5 28.7 16.4 9.5 
22.2 24.2 14.9 9.8 6.2 
17.4 19.5 12.2 9.5 10.4 4.9 
13.0 13.8 8.0 9.4 14.9 10.6 6.2 
13.8 11.4 4.1 10.6 19.1 16.3 12.7 6.9 
N24 15.6 11.2 8.4 15.3 24.1 21.9 18.2 12.7 6.6 
Z 20.9 15.7 12.1 18.6 27.9 26.6 23.3 18.1 1 • • 

- - , on -i ^fi.^ 9.8 6.8 6.y 



K15 
A16 
118 
R20 
F22 



C30 8.7 5.6.10.6 16.6 24.1 20.2 15.7 9.8 

.33 16.5 15.4 4.2 7.1 IS.O 12.B ..6 . • • 

V35 17.2 17.8 7.8 5.8 11.0 7.6 4.9 

S47 4 7 9.1 15.3 18.5 23.1 17.6 12.8 9.1 12.0 15.3 

: .5 7.7 14.7 18.6 24.2 19.2 14.7 9.9 11.0 1 . 

C51 7.1 5.4 11.0 16.4 23.5 19.2 14.6 8.7 6.9 9.6 

Zl 6 5.6 17.9 23.1 29.6 24.8 20.3 15.0 13.S 1 . 

^39 23.9 24.0 13.0 9.5 12.0 11.8 12.5 12.3 14.7 20.8 

K26 C30 r33 Y35 S47 D50 C51 R53 

C30 12.4 

F33 13.9 10.1 

y35 19.5 13,5 6.4 

S47 21.0 8.8 13.5 13.2 

D50 20.1 8.6 14.3 13.7 5.0 

C51 15.0 3.7 10.9 12.5 6.9 5.2 

R53 19.9 9.9 18.2 18.8 9.4 5.8 7.4 
R39 



24'.3 20.6 14.4 9.6 20.4 19.0 18.8 23.4 



PCr/US89/03731 
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Table 37: vgDNA to vary BPTI set #2. 



.IrAcjCCT 


g 

35 
G66 


P 

36 
CCC 


c 
37 
TGC 


k 
38 
AAA 


a 
39 
GCG. 


X 
40 


1 spacer 









208 



41 
ATC 



+ 
X 
42 
.gfk 



r 
43 
CGT 



y 

44 

TAT 



f 

45 
TTC 



y 

46 
TAC 



+ I 

X I g 

50 51 



+ 
X 
52 
qfk 



c 
53 
TGC 



q 

54 
CAG 



n 
47 
AAC 



a 
48 
GCT 



k 
49 

AAA . 



235 



/ 



t I f 
55 56 
J ^CC I TTc 



iig#28= 3'- acg gtc tgg aag 



78 nts 



overlap = 12 (7 CG, 5 AT) 



3' 

+ 

57 
qfk 
**m 



= olig#27 72 nts 



y 

58 
TAC 
atg 



g 

59 
GGT 
cca 



g 

60 
GGT 
cca 



268 



c 
61 
TGC 



r 
62 
CGT 



a 
63 
GCT 



k 
64 
AAG 



r 
65 
CGT 



n 
66 
AAC 



n 
67 
AAC 



f 

68 
TTT 



k 

69 
AAA 



295 



acg gca 



cga ttc gca ttg ttg aaa ttt 



+ 

1 s I X- 1 e 

70 71 72 
TCT| qfkJGAG 



d 




la 




73 




75 




GAT 


TGC 


ATG 


c 


eta 


acg 


tac 


gca 



322 



-5' 



spacer 



k = equal parts of T and G; m = equal parts of C and A;
q = (.26 T, .18 C, .26 A, and .30 G);
f = (.22 T, .16 C, .40 A, and .22 G);
* = complement of symbol above

Residue           40     42     50     52     57     71

Possibilities     21  x  21  x  21  x  21  x  21  x  21

Abundance x 10 of PPBD:  .671  .500  .459  .768  .271

Product = 1.77 x 10^-8

Parent = 1/(5.5 x 10^7)

Least favored one-amino-acid substitution from PPBD present
at 1 in 1.6 x 10'
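
The composition of a variegated codon follows from the nucleotide mixtures in the legend. The Python sketch below is an illustration only and is not part of the specification. It assumes the standard genetic code, takes q and f as printed in the legend, reads k as equal parts T and G, and counts Asp and Glu as acidic and Lys and Arg as basic (His is not counted). All function and variable names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at every position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_distribution(first, second, third):
    """Map each encoded residue ('*' = stop) to its probability for a codon
    whose three positions are drawn independently from the given mixtures."""
    dist = defaultdict(float)
    for (b1, p1), (b2, p2), (b3, p3) in product(
            first.items(), second.items(), third.items()):
        dist[CODON_TABLE[b1 + b2 + b3]] += p1 * p2 * p3
    return dict(dist)


# Mixtures as given in the legend (k assumed to be equal parts T and G).
Q_MIX = {"T": 0.26, "C": 0.18, "A": 0.26, "G": 0.30}
F_MIX = {"T": 0.22, "C": 0.16, "A": 0.40, "G": 0.22}
K_MIX = {"T": 0.50, "G": 0.50}

qfk = codon_distribution(Q_MIX, F_MIX, K_MIX)
acidic = qfk["D"] + qfk["E"]   # Asp + Glu
basic = qfk["K"] + qfk["R"]    # Lys + Arg (His not counted; an assumption)

for residue, p in sorted(qfk.items(), key=lambda item: -item[1]):
    print(f"{residue}: {p:.4f}")
print(f"acidic = {acidic:.4f}, basic = {basic:.4f}, stop = {qfk['*']:.4f}")
```

Under these assumptions the qfk codon encodes all twenty amino acids plus a single stop codon (TAG), which is consistent with the 21 possibilities listed per varied residue, and acidic and basic residues each make up about 12% of the outcomes.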
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Table 38: Result of varying set#2 of BPTI 2.1 



 l   e
29  30
CTC GAG                                          178
Ava I, Xho I

 p   p   y   t   g   p   c   k   a   D
31  32  33  34  35  36  37  38  39  40
CCG CCA TAT ACT GGG CCC TGC AAA GCG GAT          208
PflM I; Apa I, Dra II, Pss I

 i   Q   r   y   f   y   n   a   k
41  42  43  44  45  46  47  48  49
ATC CAG CGT TAT TTC TAC AAC GCT AAA              235

 E   g   L   c   q   t   f   S   y   g   g
50  51  52  53  54  55  56  57  58  59  60
GAG GGC CTG TGC CAG ACC TTT TCG TAC GGT GGT      268

 c   r   a   k   r   n   n   f   k
61  62  63  64  65  66  67  68  69
TGC CGT GCT AAG CGT AAC AAC TTT AAA              295
Esp I

 s   W   e   d   c   m   r   t   c   g
70  71  72  73  74  75  76  77  78  79
TCG TGG GAA GAT TGC ATG CGT ACC TGC GGT          325
Sph I

 g   a
80  81
GGC GCC
Bbe I, Nar I
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Table 39: vgDNA to vary set #2 BPTI 2.2 

                         +
             g   p   c   X   a   D
            35  36  37  38  39  40
5'-[spacer] GGG CCC TGC mrA GCG GAT                          208
            [Apa I]

             X   Q   X   X   f   y   n   a   k
            41  42  43  44  45  46  47  48  49
            rwA CAG rvk TwT TTC TAC AAC GCT AAA              235

                 +           +   +
             E   X   L   c   X   X   f   S   y   g   g
            50  51  52  53  54  55  56  57  58  59  60
            GAG qfk CTG TGC qfk qfk TTT TCG TAC GGT GGT      268
                                  olig#30 3'- g cca cca

             c   r   a   k   r   n   n   f   k
            61  62  63  64  65  66  67  68  69
            TGC CGT GCT AAG CGT AAC AAC TTT AAA              295
            acg gca cga ttc gca ttg ttg aaa ttt
            [Esp I]

                     +
             s   W   X   d   c   m
            70  71  72  73  74  75
            TCG TGG qfk GAT TGC ATG
            agc acc **m cta acg tac gcg acc tgc -5'
            [Sph I] [spacer]

olig#29 = 94 nts; olig#30 = 91 nts; overlap = 15 (11 CG, 4 AT)



k = equal parts of T and G; v = equal parts of C, A, and G;
m = equal parts of C and A; r = equal parts of A and G;
w = equal parts of A and T;
q = (.26 T, .18 C, .26 A, and .30 G);
f = (.22 T, .16 C, .40 A, and .22 G);
* = complement of symbol above



Residue            38     41     43     44     51     54     55     72

Possibilities       4  x   4  x   9  x   2  x  21  x  21  x  21  x  21  =  6.2 x 10^7

Abundance x 10     2.5    2.5    .833   5.     .663   .397   .437   .602

Product = 2.3 x 10^-8

Parent = 1/(4.4 x 10^7)     least favored = 1/(1.25 x 10^)

Least favored one-amino-acid substitution from PPBD present
at 1 in 1.2 x 10''
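
The "Abundance x 10" entries for the equimolar codons can be checked by direct enumeration. The Python sketch below is illustrative only. It assumes the standard genetic code, reads the codons at residues 38, 41, 43 and 44 as mrA, rwA, rvk and TwT, uses the symbol definitions from the legend, and takes the parental residues (K38, I41, R43, Y44) from Table 38. All names in the code are hypothetical.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at every position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Equimolar mixture symbols as defined in the legend; capitals are fixed bases.
EQUIMOLAR = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "k": "TG", "m": "CA", "r": "AG", "w": "AT", "v": "CAG",
}


def expand(codon):
    """List every concrete codon allowed by a degenerate codon string."""
    return ["".join(bases)
            for bases in product(*(EQUIMOLAR[s] for s in codon))]


def parent_abundance_x10(codon, parent):
    """Ten times the fraction of allowed codons that encode the parent."""
    codons = expand(codon)
    hits = sum(1 for c in codons if CODON_TABLE[c] == parent)
    return 10 * hits / len(codons)


# residue number -> (variegated codon, parental amino acid); assumed readings
VARIED = {38: ("mrA", "K"), 41: ("rwA", "I"), 43: ("rvk", "R"), 44: ("TwT", "Y")}
abundances = {res: parent_abundance_x10(codon, aa)
              for res, (codon, aa) in VARIED.items()}

for res, value in abundances.items():
    print(res, round(value, 3))
```

With these readings the enumeration gives 2.5, 2.5, 0.833 and 5.0 for residues 38, 41, 43 and 44, in agreement with the tabulated values.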
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Table 40; Result of varying set#2 of BPTI 2.2 



 l   e
29  30
CTC GAG                                          178

 p   p   y   t   g   p   c   E   a   D
31  32  33  34  35  36  37  38  39  40
CCG CCA TAT ACT GGG CCC TGC GAG GCG GAT          208
PflM I; Apa I

 V   Q   N   F   f   y   n   a   k
41  42  43  44  45  46  47  48  49
GTT CAG AAT TTT TTC TAC AAC GCT AAA              235

 E   F   L   c   S   A   f   S   y   g   g
50  51  52  53  54  55  56  57  58  59  60
GAG TTT CTG TGC TCT GCT TTT TCG TAC GGT GGT      268

 c   r   a   k   r   n   n   f   k
61  62  63  64  65  66  67  68  69
TGC CGT GCT AAG CGT AAC AAC TTT AAA              295

 s   W   Q   d   c   m   r   t   c   g
70  71  72  73  74  75  76  77  78  79
TCG TGG CAG GAT TGC ATG CGT ACC TGC GGT          325

 g   a
80  81
GGC GCC
Bbe I, Nar I
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Table 41: vg DNA set#2 of BPTI 2.3 





             l   e
            29  30
5'-[spacer] CTC GAG                                          178
            [Xho I]

             p   X   y   X   g   p   c   E   a   X
            31  32  33  34  35  36  37  38  39  40
            CCG vmT TAT vmT GGG CCC TGC GAG GCG qfk          208

                         +
             V   Q   N   X   f   y   n   a   k
            41  42  43  44  45  46  47  48  49
            GTT CAG AAT Tdk TTC TAC AAC GCc AAg -3'
                  olig#34 3'- g atg ttg cgg ttc

             +       +           +       +
             X   F   X   c   S   X   f   X   y   g   g
            50  51  52  53  54  55  56  57  58  59  60
            vAG TTT nTk TGC TCT qfk TTT qfk TAC GGT GGT      268
            btc aaa nam acg aga **m aaa **m atg cca cca

             c   r   a   k
            61  62  63  64
            TGC CGT GCT AAG
            acg gca cga ttc
            [Esp I] [spacer]

olig#33 = 71 nts; olig#34 = 67 nts; overlap = 13 (7 CG, 6 AT)



k = equal parts of T and G; m = equal parts of C and A;
w = equal parts of A and T; n = equal parts of A, C, G, T;
d = equal parts of A, G, T; v = equal parts of A, C, G;
q = (.26 T, .18 C, .26 A, and .30 G);
f = (.22 T, .16 C, .40 A, and .22 G);
* = complement of symbol above

Residue            32     34     40     44     50     52     55     57

Possibilities       6  x   6  x  21  x   6  x   3  x   5  x  21  x  21  =  3 x 10^7

Abundance x 10
 of PPBD          10/6   10/6   .545   10/6   10/3   30/8   .459   .701

Product = 1.01 x 10^-7

Parent = 1/(1 x 10^7)     least favored = 1/(4 x 10^)

Least favored one-amino-acid substitution from PPBD present
at 1 in 3 x 10''
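
The "Possibilities" and "Product" lines can be reproduced by expanding each variegated codon and multiplying. The Python sketch below is illustrative only. It assumes the standard genetic code, reads the variegated codons at residues 32, 34, 40, 44, 50, 52, 55 and 57 as vmT, vmT, qfk, Tdk, vAG, nTk, qfk and qfk, counts a stop codon as one possibility, and takes the "Abundance x 10" values as tabulated. All names in the code are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

# Standard genetic code, with bases ordered T, C, A, G at every position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Bases allowed by each symbol; q and f are unequal mixtures of all four bases.
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "k": "TG", "m": "CA", "w": "AT",
    "d": "AGT", "v": "ACG", "n": "ACGT",
    "q": "TCAG", "f": "TCAG",
}


def outcomes(codon):
    """Distinct translations ('*' = stop) reachable from a degenerate codon."""
    return {CODON_TABLE["".join(bases)]
            for bases in product(*(SYMBOLS[s] for s in codon))}


# residue number -> variegated codon (assumed readings of the table)
VARIED = {32: "vmT", 34: "vmT", 40: "qfk", 44: "Tdk",
          50: "vAG", 52: "nTk", 55: "qfk", 57: "qfk"}
counts = {res: len(outcomes(codon)) for res, codon in VARIED.items()}
library_size = prod(counts.values())

# "Abundance x 10 of PPBD" values exactly as tabulated, in residue order.
ABUNDANCE_X10 = [Fraction(10, 6), Fraction(10, 6), Fraction("0.545"),
                 Fraction(10, 6), Fraction(10, 3), Fraction(30, 8),
                 Fraction("0.459"), Fraction("0.701")]
parent_fraction = prod(a / 10 for a in ABUNDANCE_X10)

print(counts)
print(f"possibilities = {library_size:.2e}")
print(f"parent fraction = {float(parent_fraction):.3e}")
```

With these readings the per-residue counts are 6, 6, 21, 6, 3, 5, 21 and 21, their product is about 3 x 10^7, and the parental sequence is expected at about 1.01 x 10^-7 of the population, that is, roughly 1 in 10^7.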
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Table 42: Result of varying set#2 of BPTI 2.3 
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CLAIMS 

1. A method of obtaining a protein that binds a
predetermined target that comprises:

a) preparing a variegated population of replicable
genetic packages, each package including a nucleic
acid construct coding on expression for an
outer-surface-displayed potential binding protein
other than a single chain antibody comprising (i) a
structural signal directing the display of the
protein on the outer surface of the package and
(ii) a potential binding domain for binding said
target, where a plurality of different potential
binding domains are displayed by said population,

b) causing the expression of said proteins and the
display of said proteins on the outer surface of
such packages,

c) contacting the packages with target material so
that the potential binding domains of the proteins
and the target material may interact, and
separating packages bearing a binding domain that
binds target material from packages that do not so
bind, and

d) recovering and replicating at least one package
bearing a successful binding domain,

preferably further comprising (e) determining the
amino acid sequence of a successful binding
domain,

and more preferably, further comprising (f)
preparing a new variegated population of
replicable genetic packages according to step (a),
the parental potential binding domain for the
potential binding domains of said new packages
being a successful binding domain whose sequence
was determined in step (e), and repeating steps
(b)-(e) with said new population.

2. The method of claim 1 wherein the population of
replicable genetic packages of step (a) is
obtained by:

i) preparing a variegated population of DNA
inserts, each of which comprises a first
sequence which codes on expression for a potential
binding domain and a second sequence encoding a
signal directing that the encoded protein be
displayed on the outer surface of a chosen
replicable genetic package, and

ii) incorporating the resulting population of DNA
constructs into the chosen replicable genetic
packages to produce a population of replicable
genetic packages,

wherein preferably (1) said population is
characterized by the display of at least 10^ but
not more than 10^ different potential binding
domains and/or (2) from 1 in 10^ to 1 in 10^ of
the packages of said population display the same
potential binding domain.

3. The method of claim 1 wherein, in step (a), the
potential binding domains encoded by the nucleic
acid constructs are each related in sequence to a
parental potential binding domain by a limited
number of amino acid substitutions in the amino
acid sequence of said parental potential binding
domain, and, preferably, the level of variegation
of the population is chosen such that the packages
displaying potential binding domains obtained by
single amino acid substitutions in the amino acid
sequence of the parental potential binding domain
are present in detectable amounts, and preferably
the initially chosen parental potential binding
protein has at least one stable binding domain and
said domain has a melting point of at least 60°C
and is stable over a pH range of at least 3.0-8.0.

4. The method of claim 1 wherein the displayable
potential binding protein is a chimeric protein,
and preferably, wherein said signal is provided
by a segment of said chimeric protein which is
essentially identical in amino acid sequence with
at least a functional portion of a natural outer
surface protein encoded by said genetic package or
a cell naturally infected by said genetic package,
said portion directing the transport of said
chimeric protein to the outer surface of the
genetic package.

5. The method of claim 3 wherein the parental
potential binding domain is initially chosen to be
one which is over 50% homologous with a domain of
a known protein, the latter domain having a
melting point of at least about 60°C.

6. The method of claim 5 wherein the initially chosen
parental binding protein does not preferentially
bind the predetermined target.

7. The method of claim 3, said target material
comprising one or more discrete molecules, said
parental potential binding domain being
characterized as a sequence of amino acids,
further comprising identifying an interaction set
of amino acids which are on the surface of the
parental potential binding domain and which can
all simultaneously touch a single molecule of the
target material, and obtaining potential binding
domains by substituting a different amino acid for
one or more of the amino acids in said interaction
set.

8. The method of claim 1 wherein the target material
is a non-macromolecular organic compound and the
potential binding domains comprise greater than
about 80 amino acid residues.

9. The method of claim 1 wherein the target material
is a non-macromolecular organic compound and the
potential binding domains comprise greater than
about 80 amino acid residues.

10. The method of claim 1 wherein the target material
is a mineral insoluble in aqueous solution.

11. The method of claim 1 wherein the target is an
inorganic molecule or complex ion that is stable
in aqueous solution.

12. The method of claim 1 wherein the target is an
organometallic compound that is stable in aqueous
solution.

13. The method of claim 1 wherein the target material
is a general protease, wherein the immobilized
target material is first incubated with an
irreversible or covalent inhibitor to inactivate
the protease,

14. The method of claim 1 wherein the replicable
genetic package is a cell or virus that can be
affinity separated and retain viability.

15. The method of claim 5 wherein the known binding
protein is an enzyme, the activity of which has a
deleterious effect on the replicable genetic
package, the host of the replicable genetic
package, or the target, wherein the majority of
the nucleic acid constructs code on expression for
an analogue of the known binding protein that does
not have such deleterious enzymatic activity.

16. The method of claim 1 wherein the target contains
ionizable groups and the pH of the solutions of
the intended use and the pH of the affinity
separations are chosen so that both the potential
binding protein and the target remain stable.

17. The method of claim 1 wherein the target contains
ionizable groups, further comprising providing
counter ions to reduce electrostatic repulsion
between the potential binding protein and the
target.

18. The method of claim 1 wherein the initial
potential binding domain is picked so that, under
the conditions of intended use of the desired
binding protein and under the conditions of
affinity separation, the potential binding
domains and the target will either have opposite
charge or one of them will be neutral.

19. The method of claim 28 wherein the replicable
genetic package is a bacterial cell, such as
a strain of Escherichia coli.

20. The method of claim 1 wherein the replicable
genetic package is a bacterial spore such as
a Bacillus endospore, more preferably an endospore
of a strain of Bacillus subtilis.

21. The method of claim 1 wherein the replicable
genetic package is a bacteriophage, such as a
filamentous phage, preferably a derivative of an
M13 Escherichia coli bacteriophage or derivative
of the Pseudomonas aeruginosa filamentous phage
Pf1.

22. The method of claim 21 wherein the signal is
provided by the coat protein of M13 or a segment
thereof embodying an outer surface transport
signal.

23. The method of claim 21 wherein the signal is
provided by the gene III protein of M13 or a
segment thereof embodying an outer surface
transport signal.

24. The method of claim 2 wherein the distribution of
nucleotides incorporated at each variegated codon
is chosen to yield substantially equal abundances
of acidic and basic amino acids, and, preferably,
the distribution of nucleotides incorporated at
each variegated codon is further chosen to yield
the largest value for the quantity {(1 -
abundance(stop codons)) times (abundance of the
least abundant amino acid)/(abundance of the most
abundant amino acid)}.

25. The method of claim 1, wherein step (c) further
comprises contacting the packages with a second
material and isolating packages which do not bind
that second material.

26. The method of claim 1, wherein after obtaining a
novel binding protein recognizing a first
predetermined target, the novel binding protein is
chosen as a parental potential binding protein for
the isolation of a derivative protein which also
binds to a second predetermined target.

27. The method of claim 3 wherein the initially chosen
parental potential binding domain is selected from
the group consisting of (a) binding domains of
bovine pancreatic trypsin inhibitor, crambin,
ovomucoid, T4 lysozyme, hen egg white lysozyme,
ribonuclease, and azurin, and (b) domains at least
50% homologous with any of the foregoing domains
and which have a melting point of at least 60°C.

28. The method of claim 36 wherein the outer surface
transport signal is provided by the protein
or a segment thereof embodying an outer surface
transport signal.

29. The method of claim 38 wherein the outer surface
transport signal is provided by the
cotC or cotD protein or a segment thereof
embodying an outer surface transport signal.

30. A chimeric protein comprising (i) at least a
segment of an outer surface protein of a cell or
virus, said segment providing an outer surface
transport signal recognized by said cell or virus
and (ii) a domain foreign to said outer surface
protein, and, preferably, said foreign domain
binds to a target material not preferentially
bound by said outer surface protein.

31. A replicable genetic package which contains a
nucleic acid construct which codes on expression
for the chimeric protein of claim 30.

32. The method of claim 1 wherein in at least one
instance the amino acid residues varied in a first
assortment of potential binding domains are left
constant in the next assortment of potential
binding domains.

33. A method of preparing a population of variegated
DNA wherein the distribution of nucleotides
incorporated at each variegated codon is chosen to
yield substantially equal abundances of acidic and
basic amino acids, and, preferably, the
distribution of nucleotides incorporated at each
variegated codon is further chosen to yield the
largest value for the quantity {(1 - abundance(stop
codons)) times (abundance of the least abundant
amino acid)/(abundance of the most abundant amino
acid)}.

34. The protein of claim 66, wherein the protein
comprises a first foreign domain recognizing a
first target material and a second foreign domain
recognizing a second target material.

35. The method of claim 3 wherein the initially chosen
parental potential binding domain is at least 50%
homologous with the binding domain of bovine
pancreatic trypsin inhibitor.
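
The quantity recited in claims 24 and 33 can be evaluated numerically for any choice of nucleotide mixtures. The Python sketch below is illustrative only and is not part of the claims. It assumes the standard genetic code, uses the q, f and k mixtures given in the table legends (k taken as equal parts T and G), and compares them with two equimolar schemes, labeled NNK and NNN here for convenience. The function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at every position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def codon_distribution(first, second, third):
    """Probability of each translation ('*' = stop) for one variegated codon."""
    dist = defaultdict(float)
    for (b1, p1), (b2, p2), (b3, p3) in product(
            first.items(), second.items(), third.items()):
        dist[CODON_TABLE[b1 + b2 + b3]] += p1 * p2 * p3
    return dict(dist)


def figure_of_merit(first, second, third):
    """(1 - abundance(stop)) * (least abundant aa) / (most abundant aa)."""
    dist = codon_distribution(first, second, third)
    stop = dist.get("*", 0.0)
    abundances = [dist.get(aa, 0.0) for aa in AMINO_ACIDS]
    return (1.0 - stop) * min(abundances) / max(abundances)


Q_MIX = {"T": 0.26, "C": 0.18, "A": 0.26, "G": 0.30}
F_MIX = {"T": 0.22, "C": 0.16, "A": 0.40, "G": 0.22}
K_MIX = {"T": 0.50, "G": 0.50}
N_MIX = {"T": 0.25, "C": 0.25, "A": 0.25, "G": 0.25}  # equimolar, for comparison

merit_qfk = figure_of_merit(Q_MIX, F_MIX, K_MIX)
merit_nnk = figure_of_merit(N_MIX, N_MIX, K_MIX)
merit_nnn = figure_of_merit(N_MIX, N_MIX, N_MIX)

print(f"qfk: {merit_qfk:.3f}  NNK: {merit_nnk:.3f}  NNN: {merit_nnn:.3f}")
```

Under these assumptions the qfk mixture scores higher than either equimolar scheme. The sketch only evaluates given mixtures and does not search for the distribution that maximizes the quantity.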
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CHOOSE GENETIC PACKAGE , 
OUTER SURFACE PROTEIN, 
OCV AND IPBD 

↓

SET PPBD = IPBD

↓

CHOOSE RESIDUES TO VARY

↓

SYNTHESIZE VG DNA, CLONE INTO OCV, &
INTRODUCE INTO wtGP TO OBTAIN GP(PBD)

↓

CAUSE GPs TO EXPRESS AND DISPLAY PBDs

↓

►USE AFFINITY SEPARATION TO ISOLATE
GP(SBD)s FROM OTHER GP(PBD)s
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